METHODS OF DIAGNOSIS OF LUNG CANCER, COMPOSITIONS AND METHODS
OF SCREENING FOR MODULATORS OF LUNG CANCER

CROSS-REFERENCES TO RELATED APPLICATIONS 

This application is related to USSN 60/284,770, filed April 18, 2001; USSN 
60/290,492, filed May 10, 2001; USSN 60/334,370, filed November 29, 2001; USSN 
60/339,245, filed November 9, 2001; USSN 60/350,666, filed November 13, 2001; and 
USSN 60/xxx,xxx, filed April 12, 2002 (Docket OMNI-002P); each of which is incorporated
herein by reference in its entirety. 

FIELD OF THE INVENTION 
The invention relates to the identification of nucleic acid and protein expression 
profiles and nucleic acids, products, and antibodies thereto that are involved in lung cancer;
and to the use of such expression profiles and compositions in diagnosis and therapy of lung 
cancer. The invention further relates to methods for identifying and using agents and/or 
targets that inhibit lung cancer or related conditions. 

BACKGROUND OF THE INVENTION

Lung cancer is the second most commonly occurring cancer in the United States and 
is the leading cause of cancer-related death. It is estimated that there are over 160,000 new 
cases of lung cancer in the United States every year. Of those who are diagnosed with lung 
cancer, 86 percent will die within five years. Lung cancer is the most common visceral
cancer in men and accounts for nearly one third of all cancer deaths in both men and women.
In fact, lung cancer accounts for 7% of all deaths, due to any cause, in both men and women. 

Smoking is the primary cause of lung cancer, with more than 80% of lung cancers 
resulting from smoking. About 400 to 500 separate gaseous substances are present in the 
smoke of a non-filter cigarette. The most noteworthy substances include nitrogen oxides,
hydrogen cyanide, formaldehyde, benzene, and toluene. The particles present in cigarette
smoke contain at least 3,500 individual compounds such as nicotine, tobacco alkaloids 
(nornicotine, anatabine, anabasine), polycyclic aromatic hydrocarbons (e.g., benzo(a)pyrene, 
B(a)P), naphthalenes, aromatic amines, phenols, and tobacco-specific nitrosamines.

Tobacco-specific nitrosamines are formed during tobacco curing and processing, and
are suspected of causing lung cancer in humans. In rodent studies, regardless of where or
how it is applied, the tobacco-specific nitrosamine known as NNK produces lung adenomas
and lung adenocarcinomas. The tobacco-specific nitrosamine known as NNAL also produces
lung adenocarcinomas in rodents.

Many of the chemicals found in cigarette smoke also affect the nonsmoker inhaling
"secondhand" or sidestream smoke. Indeed, the smoke inhaled by nonsmokers has a
chemical composition similar to the smoke inhaled by smokers, but, importantly, the
carcinogenic tobacco-specific nitrosamines are present in higher concentrations in
secondhand smoke. For this and other reasons, "passive smoking" is an important cause of
lung cancer, causing as many as 3,000 lung cancer deaths in nonsmokers each year.

In addition to smoking, other factors thought to be causes of lung cancer include
on-the-job exposure to carcinogens such as asbestos and uranium; exposure to chemical hazards
such as radon, polycyclic aromatic hydrocarbons, chromium, nickel, and inorganic arsenic;
genetic factors; and diet.

Histological classification of lung cancers defines the types of cancer that
begin in the lung. See, e.g., Travis, et al. (1999) Histological Typing of Lung and Pleural
Tumours (International Histological Classification of Tumours, No. 1). Four major cell types
make up more than 88% of all primary lung neoplasms. These are: squamous or epidermoid
carcinoma, small cell (also called oat cell) carcinoma, adenocarcinoma, and large cell (also
called large cell anaplastic) carcinoma. The remainder include undifferentiated carcinomas,
carcinoids, bronchial gland tumors, and other rarer types. The various cell types have
different natural histories and responses to therapy, and, thus, a correct histologic diagnosis is
the first step of effective treatment.

Small cell lung cancer (SCLC) accounts for 18-25% of all lung cancers. It occurs
less frequently than non-small cell lung cancer but generally spreads to distant organs more
rapidly. In general, at the time of presentation, small cell lung cancers have already spread
beyond the bounds where surgery with curative intent can be undertaken. However, if
identified early enough, these cancers are often responsive to chemotherapy and thoracic
radiation treatment.

Non-small cell lung cancers (NSCLC) are the more frequently occurring form of lung
cancer. They comprise squamous cell carcinoma, adenocarcinoma, and large cell carcinoma
and account for more than 75% of all lung cancers. Non-small cell tumors that are localized
at the time of presentation can sometimes be cured with surgery and/or radiotherapy, but
usually they are not identified until significant metastasis has occurred, at which point they
are typically not very responsive to surgery, chemotherapy, or radiation treatment.

The screening of asymptomatic persons at high risk for lung cancer has often proven
ineffective. In general, only 5 to 15 percent of lung cancer patients have their disease
detected while they are asymptomatic. Of course, early detection and treatment are critical
factors in the fight against lung cancer. The average survival rate is 49% for those whose
cancer is detected early, before the cancer has spread from the lung. Lung cancer often
spreads outside of the lung, and it may have spread to the bones or brain by the time it is
diagnosed. While the prognosis may be better for lung cancers that are detected early,
because of the lack of effective curative treatments, early detection does not necessarily alter
the total death rate from lung cancer.

Thus, methods for diagnosis and prognosis of lung cancer and effective treatment of
lung cancer would be desirable. Accordingly, provided herein are methods that can be used
in diagnosis and prognosis of lung cancer. Further provided are methods that can be used to
screen candidate therapeutic agents for the ability to modulate, e.g., treat, lung cancer.
Additionally, provided herein are molecular targets and compositions for therapeutic
intervention in lung disease and other metastatic cancers.

SUMMARY OF THE INVENTION 
The present invention provides nucleotide sequences of genes that are up- and
down-regulated in lung cancer cells. Such genes are useful for diagnostic purposes, and also as
targets for screening for therapeutic compounds that modulate lung cancer, such as
antibodies. The methods of detecting nucleic acids of the invention or their encoded proteins
can be used for a number of purposes. Examples include early detection of lung cancers,
monitoring and early detection of relapse following treatment of lung cancers, monitoring
response to therapy of lung cancers, determining prognosis of lung cancers, directing therapy
of lung cancers, selecting patients for postoperative chemotherapy or radiation therapy,
selecting therapy, determining tumor prognosis, treatment, or response to treatment, and early
detection of precancerous lesions of the lung. Examples of benign or precancerous lesions
include: atelectasis, emphysema, bronchitis, chronic obstructive pulmonary disease, fibrosis,
hypersensitivity pneumonitis (HP), interstitial pulmonary fibrosis (IPF), asthma, and
bronchiectasis. Other aspects of the invention will become apparent to the skilled artisan from
the following description of the invention.

In one aspect, the present invention provides a method of detecting a lung
cancer-associated transcript in a cell from a patient, the method comprising contacting a
biological sample from the patient with a polynucleotide that selectively hybridizes to a
sequence at least 80% identical to a sequence as shown in Tables 1A-16. Alternatively, the
sample may be contacted with a specific binding reagent, e.g., an antibody.

In one embodiment, the polynucleotide selectively hybridizes to a sequence at least 
95% identical to a sequence as shown in Tables 1A-16. In another embodiment, the 
polynucleotide comprises a sequence as shown in Tables 1A-16.

In one embodiment, the biological sample is a tissue sample, or a body fluid. In 
another embodiment, the biological sample comprises isolated nucleic acids, e.g., mRNA. 

In one embodiment, the polynucleotide is labeled, e.g., with a fluorescent label. In 
one embodiment, the polynucleotide is immobilized on a solid surface. In one embodiment, 
the patient is undergoing a therapeutic regimen to treat lung cancer. In another embodiment,
the patient is suspected of having lung cancer. In one embodiment, the patient is a primate, 
e.g., a human. 

In one embodiment, the method further comprises the step of amplifying nucleic acids 
before the step of contacting the biological sample with the polynucleotide. 

In another aspect, the present invention provides a method of monitoring the efficacy
of a therapeutic treatment of lung cancer, the method comprising the steps of: (i) providing a
biological sample from a patient undergoing the therapeutic treatment; and (ii) determining
the level of a lung cancer-associated transcript in the biological sample by contacting the
biological sample with a polynucleotide that selectively hybridizes to a sequence at least 80%
identical to a sequence as shown in Tables 1A-16, thereby monitoring the efficacy of the
therapy. Alternatively, the sample may be evaluated for protein, e.g., by contacting the sample
with an antibody.

In one embodiment, the method further comprises the step of: (iii) comparing the
level of the lung cancer-associated transcript to a level of the lung cancer-associated
transcript in a biological sample from the patient prior to, or earlier in, the therapeutic
treatment. Alternatively, the sample may be evaluated for comparison of protein.

In another aspect, the present invention provides a method of monitoring the efficacy
of a therapeutic treatment of lung cancer, the method comprising the steps of: (i) providing a
biological sample from a patient undergoing the therapeutic treatment; and (ii) determining
the level of a lung cancer-associated antibody in the biological sample by contacting the
biological sample with a polypeptide encoded by a polynucleotide that selectively hybridizes
to a sequence at least 80% identical to a sequence as shown in Tables 1A-16, wherein the
polypeptide specifically binds to the lung cancer-associated antibody, thereby monitoring the
efficacy of the therapy.

In one embodiment, the method further comprises the step of: (iii) comparing the
level of the lung cancer-associated antibody to a level of the lung cancer-associated antibody 
in a biological sample from the patient prior to, or earlier in, the therapeutic treatment. 

In another aspect, the present invention provides a method of monitoring the efficacy
of a therapeutic treatment of lung cancer, the method comprising the steps of: (i) providing a
biological sample from a patient undergoing the therapeutic treatment; and (ii) determining
the level of a lung cancer-associated polypeptide in the biological sample by contacting the
biological sample with an antibody, wherein the antibody specifically binds to a polypeptide
encoded by a polynucleotide that selectively hybridizes to a sequence at least 80% identical
to a sequence as shown in Tables 1A-16, thereby monitoring the efficacy of the therapy.

In one embodiment, the method further comprises the step of: (iii) comparing the
level of the lung cancer-associated polypeptide to a level of the lung cancer-associated
polypeptide in a biological sample from the patient prior to, or earlier in, the therapeutic
treatment.

In one aspect, the present invention provides an isolated nucleic acid molecule
consisting of a polynucleotide sequence as shown in Tables 1A-16. In one embodiment, an
expression vector or cell comprises the isolated nucleic acid. In one aspect, the present
invention provides an isolated polypeptide which is encoded by a nucleic acid molecule
having a polynucleotide sequence as shown in Tables 1A-16.

In another aspect, the present invention provides an antibody that specifically binds to
an isolated polypeptide which is encoded by a nucleic acid molecule having a polynucleotide
sequence as shown in Tables 1A-16. In one embodiment, the antibody is conjugated to an
effector component, e.g., a fluorescent label, a radioisotope, or a cytotoxic chemical. In one
embodiment, the antibody is an antibody fragment. In another embodiment, the antibody is
humanized.

In one aspect, the present invention provides a method of detecting lung cancer in a
patient, the method comprising contacting a biological sample from the patient with an
antibody or protein as described herein.

In another aspect, the present invention provides a method of detecting antibodies
specific to a lung cancer gene in a patient, the method comprising contacting a biological
sample from the patient with a polypeptide encoded by a nucleic acid that comprises a sequence
from Tables 1A-16.

In another aspect, the present invention provides a method for identifying a compound
that modulates a lung cancer-associated polypeptide, the method comprising the steps of: (i)
contacting the compound with a lung cancer-associated polypeptide, the polypeptide encoded
by a polynucleotide that selectively hybridizes to a sequence at least 80% identical to a
sequence as shown in Tables 1A-16; and (ii) determining the functional effect of the
compound upon the polypeptide.

In one embodiment, the functional effect is a physical effect, an enzymatic effect, or a
chemical effect. In one embodiment, the polypeptide is expressed in a eukaryotic host cell or
cell membrane. In another embodiment, the polypeptide is recombinant. In one
embodiment, the functional effect is determined by measuring ligand binding to the
polypeptide.

In another aspect, the present invention provides a method of inhibiting proliferation 
or another critical process of a lung cancer-associated cell to treat lung cancer in a patient, the 
method comprising the step of administering to the subject a therapeutically effective amount 
of a compound identified as described herein. In one embodiment, the compound is an 
antibody.

In another aspect, the present invention provides a drug screening assay comprising
the steps of: (i) administering a test compound to a mammal having lung cancer or a cell
isolated therefrom; and (ii) comparing the level of gene expression of a polynucleotide that
selectively hybridizes to a sequence at least 80% identical to a sequence as shown in Tables
1A-16 in a treated cell or mammal with the level of gene expression of the polynucleotide in
a control cell or mammal, wherein a test compound that modulates the level of expression of
the polynucleotide is a candidate for the treatment of lung cancer.

In one embodiment, the control is a mammal with lung cancer or a cell therefrom that
has not been treated with the test compound. In another embodiment, the control is a normal
cell or mammal, or a cell or mammal with a non-malignant lung disease.

In another aspect, the present invention provides a method for treating a mammal
having lung cancer comprising administering a compound identified by the assay described
herein.

In another aspect, the present invention provides a pharmaceutical composition for
treating a mammal having lung cancer, the composition comprising a compound identified by
the assay described herein and a physiologically acceptable excipient.

DETAILED DESCRIPTION OF THE INVENTION

In accordance with the objects outlined above, the present invention provides novel
methods for diagnosis and treatment of lung disease or cancer, as well as methods for
screening for compositions which modulate lung cancer. "Treatment, monitoring, detection
or modulation of lung disease or cancer" includes treatment, monitoring, detection, or
modulation of lung disease in those patients who have lung disease (whether malignant or
non-malignant, e.g., emphysema, bronchitis, or fibrosis) as well as patients with lung cancers
in which gene expression from a gene in Tables 1A-16 is increased or decreased, indicating
that the subject is more likely to have disease. In particular, while these targets are identified
primarily from lung cancer samples, these same targets are likely to be similarly found in
analyses of other medical conditions. These other conditions may result from similar
pathological processes which affect similar tissues, e.g., lung cancer, small cell lung
carcinoma (oat cell carcinoma), non-small cell carcinomas (e.g., squamous cell carcinoma,
adenocarcinoma, large cell lung carcinoma, carcinoid, granulomatous), fibrosis (idiopathic
pulmonary fibrosis (IPF), hypersensitivity pneumonitis (HP), interstitial pneumonitis,
nonspecific idiopathic pneumonitis (NSIP)), chronic obstructive pulmonary disease (COPD,
e.g., emphysema, chronic bronchitis), asthma, bronchiectasis, and esophageal cancer. See,
e.g., the NCI webpage and USSN 60/347,349 and USSN 60/xxx,xxx (docket LFBR-001-1P,
filed March 29, 2002), each of which is incorporated herein by reference. The treatment may
be of lung cancer or a related condition itself, or treatment of metastasis.

In particular, identification of markers selectively expressed on these cancers allows
for use of that expression in diagnostic, prognostic, or therapeutic methods. As such, the
invention defines various compositions, e.g., nucleic acids, polypeptides, antibodies, and
small molecule agonists/antagonists, which will be useful to selectively identify those
markers. For example, therapeutic methods may take the form of protein therapeutics which
use the marker expression for selective localization or modulation of function (for those
markers which have a causative disease effect), for vaccines, identification of binding
partners, or antagonism, e.g., using antisense or RNAi. The markers may be useful for
molecular characterization of subsets of lung diseases, which subsets may actually require
very different treatments. Moreover, the markers may also be important in diseases related to
the specific cancers, e.g., those which affect similar tissues in non-malignant diseases, or have
similar mechanisms of induction/maintenance. Metastatic processes or characteristics may
also be targeted. Diagnostic and prognostic uses are made available, e.g., to subset related
but distinct diseases, or to determine treatment strategy. The detection methods may be based
upon nucleic acid, e.g., PCR or hybridization techniques, or protein, e.g., ELISA, imaging,
IHC, etc. The diagnosis may be qualitative or quantitative, and may detect increases or
decreases in expression levels.

Tables 1A-16 provide unigene cluster identification numbers for the nucleotide
sequence of genes that exhibit increased or decreased expression in lung cancer samples. The
tables also provide an exemplar accession number that provides a nucleotide sequence that is
part of the unigene cluster. In Table 1A, genes marked as "target 1" or "target 2" are
particularly useful as therapeutic targets. Genes marked as "target 3" are particularly useful
as diagnostic markers. Genes marked as "chron" are upregulated in chronically diseased lung
(e.g., emphysema, bronchitis, fibrosis) relative to lung tumors and normal tissue. In certain
analyses, the ratio for the "chron" category was determined using the 70th percentile of
chronically diseased lung samples divided by the 90th percentile of normal lung samples.
The ratio for the targets was determined using the 70th percentile of lung tumor samples
divided by the 90th percentile of normal lung samples.
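
The percentile ratio described above can be illustrated with a short sketch. It assumes that
expression values are plain numeric intensities and that percentiles are obtained by linear
interpolation between ranked values; the interpolation method is not specified in the text, and
the function names and sample values are hypothetical.

```python
def percentile(values, q):
    """Return the q-th percentile (0-100) of values.

    Linear interpolation between ranked values is an assumption;
    the description does not state which method was used.
    """
    if not values:
        raise ValueError("need at least one value")
    ordered = sorted(values)
    rank = (len(ordered) - 1) * q / 100.0
    lower = int(rank)
    upper = min(lower + 1, len(ordered) - 1)
    fraction = rank - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * fraction


def expression_ratio(disease_samples, normal_samples):
    """70th percentile of disease samples over 90th percentile of normal samples."""
    return percentile(disease_samples, 70) / percentile(normal_samples, 90)


# Hypothetical expression intensities for one gene.
tumor = [120.0, 340.0, 560.0, 610.0, 980.0]
normal = [40.0, 55.0, 60.0, 75.0, 90.0]
ratio = expression_ratio(tumor, normal)
print(f"tumor/normal ratio: {ratio:.2f}")
```

The same function would apply to the "chron" category by passing chronically diseased lung
samples in place of tumor samples.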

Definitions 

The term "lung cancer protein" or "lung cancer polynucleotide" or "lung
cancer-associated transcript" refers to nucleic acid and polypeptide polymorphic variants, alleles,
mutants, and interspecies homologs that: (1) have a nucleotide sequence that has greater than
about 60% nucleotide sequence identity, 65%, 70%, 75%, 80%, 85%, 90%, preferably 91%,
92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% or greater nucleotide sequence identity,
preferably over a region of at least about 25, 50, 100, 200, 500, 1000, or
more nucleotides, to a nucleotide sequence of or associated with a unigene cluster of Tables
1A-16; (2) bind to antibodies, e.g., polyclonal antibodies, raised against an immunogen
comprising an amino acid sequence encoded by a nucleotide sequence of or associated with a
unigene cluster of Tables 1A-16, and conservatively modified variants thereof; (3)
specifically hybridize under stringent hybridization conditions to a nucleic acid sequence, or
the complement thereof, of Tables 1A-16 and conservatively modified variants thereof; or (4)
have an amino acid sequence that has greater than about 60% amino acid sequence identity,
65%, 70%, 75%, 80%, 85%, 90%, preferably 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%,
or 99% or greater amino acid sequence identity, preferably over a region of at
least about 25, 50, 100, 200, 500, 1000, or more amino acids, to an amino acid sequence
encoded by a nucleotide sequence of or associated with a unigene cluster of Tables 1A-16. A
polynucleotide or polypeptide sequence is typically from a mammal including, but not
limited to, primate, e.g., human; rodent, e.g., rat, mouse, hamster; cow, pig, horse, sheep, or
other mammal. A "lung cancer polypeptide" and a "lung cancer polynucleotide" include
both naturally occurring and recombinant forms.

A "full length" lung cancer protein or nucleic acid refers to a lung cancer polypeptide
or polynucleotide sequence, or a variant thereof, that contains the elements normally
contained in one or more naturally occurring, wild type lung cancer polynucleotide or
polypeptide sequences. The "full length" may be prior to, or after, various stages of
post-translational processing or splicing, including alternative splicing.

"Biological sample" as used herein is a sample of biological tissue or fluid that
contains nucleic acids or polypeptides, e.g., of a lung cancer protein, polynucleotide, or
transcript. Such samples include, but are not limited to, tissue isolated from primates, e.g.,
humans, or rodents, e.g., mice and rats. Biological samples may also include sections of
tissues such as biopsy and autopsy samples, frozen sections taken for histologic purposes,
archival materials, blood, plasma, serum, sputum, stool, tears, mucus, hair, skin, etc.
Biological samples also include explants and primary and/or transformed cell cultures
derived from patient tissues. A biological sample is typically obtained from a eukaryotic
organism, most preferably a mammal such as a primate, e.g., chimpanzee or human; cow;
dog; cat; a rodent, e.g., guinea pig, rat, mouse; rabbit; or other mammal; or a bird, reptile, or
fish. Livestock and domestic animals are of interest.

"Providing a biological sample" means to obtain a biological sample for use in 
methods described in this invention. Most often, this will be done by removing a sample of 
cells from an animal, but can also be accomplished by using previously isolated cells (e.g., 
isolated by another person, at another time, and/or for another purpose), or by performing the
methods of the invention in vivo. Archival tissues or materials, having treatment or outcome
history, will be particularly useful. 

The terms "identical" or percent "identity," in the context of two or more nucleic 
acids or polypeptide sequences, refer to two or more sequences or subsequences that are the 






WO 02/086443 PCT/US02/12476 

same or have a specified percentage of amino acid residues or nucleotides that are the same
(e.g., about 60% identity, preferably 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%,
95%, 96%, 97%, 98%, 99%, or higher identity over a specified region, when compared and
aligned for maximum correspondence over a comparison window or designated region) as
measured using, e.g., a BLAST or BLAST 2.0 sequence comparison algorithm with default
parameters described below, or by manual alignment and visual inspection (see, e.g., NCBI
web site http://www.ncbi.nlm.nih.gov/BLAST/ or the like). Such sequences are then said to
be "substantially identical." This definition also refers to, or may be applied to, the
complement of a test sequence. The definition also includes sequences that have deletions
and/or insertions, substitutions, and naturally occurring, e.g., polymorphic or allelic variants,
and man-made variants. As described below, the preferred algorithms can account for gaps
and the like. Preferably, identity exists over a region that is at least about 25 amino acids or
nucleotides in length, or more preferably over a region that is 50-100 amino acids or
nucleotides in length.

For sequence comparison, typically one sequence acts as a reference sequence, to
which test sequences are compared. When using a sequence comparison algorithm, test and
reference sequences are entered into a computer, subsequence coordinates are designated, if
necessary, and sequence algorithm program parameters are designated. Preferably, default
program parameters can be used, or alternative parameters can be designated. The sequence
comparison algorithm then calculates the percent sequence identities for the test sequences
relative to the reference sequence, based on the program parameters.
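
As a minimal illustration of the percent identity calculation, the sketch below scores a test
sequence against a reference sequence that have already been aligned. It assumes that gaps are
written as "-" and that every aligned column counts toward the denominator; alignment programs
differ on this convention, and the function name and example sequences are hypothetical.

```python
def percent_identity(aligned_test, aligned_ref, gap="-"):
    """Percent of aligned columns in which both sequences carry the same residue.

    Assumes the inputs are already aligned to equal length. Columns where
    both sequences have a gap are skipped; a gap opposite a residue counts
    as a mismatch (an assumed convention).
    """
    if len(aligned_test) != len(aligned_ref):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for test_residue, ref_residue in zip(aligned_test, aligned_ref):
        if test_residue == gap and ref_residue == gap:
            continue
        compared += 1
        if test_residue == ref_residue:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


# Hypothetical nine-column alignment with one gap and one mismatch.
identity = percent_identity("ACGT-ACGT", "ACGTTACCT")
print(f"{identity:.1f}% identity")
```

A result from such a calculation could then be compared against a chosen threshold, e.g., 80%
or 95%, over the designated region.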

A "comparison window", as used herein, includes reference to a segment of 
contiguous positions selected from the group consisting typically of from 20 to 600, usually 
about 50 to about 200, more usually about 100 to about 150, in which a sequence may be
compared to a reference sequence of the same number of contiguous positions after the two
sequences are optimally aligned. Methods of alignment of sequences for comparison are
well-known in the art. Optimal alignment of sequences for comparison can be conducted,
e.g., by the local homology algorithm of Smith and Waterman (1981) Adv. Appl. Math.
2:482, by the homology alignment algorithm of Needleman and Wunsch (1970) J. Mol. Biol.
48:443, by the search for similarity method of Pearson and Lipman (1988) Proc. Nat'l. Acad.
Sci. USA 85:2444, by computerized implementations of these algorithms (GAP, BESTFIT,
FASTA, and TFASTA in the Wisconsin Genetics Software Package, Genetics Computer
Group, 575 Science Dr., Madison, WI), or by manual alignment and visual inspection (see,
e.g., Ausubel, et al. (eds. 1995 and supplements) Current Protocols in Molecular Biology).

Preferred examples of algorithms that are suitable for determining percent sequence
identity and sequence similarity include the BLAST and BLAST 2.0 algorithms, which are
described in Altschul, et al. (1997) Nuc. Acids Res. 25:3389-3402 and Altschul, et al. (1990)
J. Mol. Biol. 215:403-410. BLAST and BLAST 2.0 are used, with the parameters described
herein, to determine percent sequence identity for the nucleic acids and proteins of the
invention. Software for performing BLAST analyses is publicly available through the
National Center for Biotechnology Information (http://www.ncbi.nlm.nih.gov/). This
algorithm involves first identifying high scoring sequence pairs (HSPs) by identifying short
words of length W in the query sequence, which either match or satisfy some positive-valued
threshold score T when aligned with a word of the same length in a database sequence. T is
referred to as the neighborhood word score threshold (Altschul, et al., supra). These initial
neighborhood word hits act as seeds for initiating searches to find longer HSPs containing
them. The word hits are extended in both directions along each sequence for as far as the
cumulative alignment score can be increased. Cumulative scores are calculated using, e.g.,
for nucleotide sequences, the parameters M (reward score for a pair of matching residues;
always > 0) and N (penalty score for mismatching residues; always < 0). For amino acid
sequences, a scoring matrix is used to calculate the cumulative score. Extension of the word
hits in each direction is halted when: the cumulative alignment score falls off by the
quantity X from its maximum achieved value; the cumulative score goes to zero or below,
due to the accumulation of one or more negative-scoring residue alignments; or the end of
either sequence is reached. The BLAST algorithm parameters W, T, and X determine the
sensitivity and speed of the alignment. The BLASTN program (for nucleotide sequences)
uses as defaults a wordlength (W) of 11, an expectation (E) of 10, M=5, N=-4, and a
comparison of both strands. For amino acid sequences, the BLASTP program uses as
defaults a wordlength of 3, an expectation (E) of 10, and the BLOSUM62 scoring matrix
(see Henikoff and Henikoff (1992) Proc. Natl. Acad. Sci. USA 89:10915), alignments (B) of
50, expectation (E) of 10, M=5, N=-4, and a comparison of both strands.
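
The extension rule described above can be sketched for nucleotide sequences as follows. This is
a simplified, ungapped, rightward-only illustration and not the BLAST implementation itself: the
function name, the default value of X, and the short seed used in the example are assumptions,
while M=5 and N=-4 are the BLASTN defaults stated above.

```python
def extend_right(query, subject, q_start, s_start, seed_len,
                 match=5, mismatch=-4, x_drop=20):
    """Extend a word hit to the right without gaps.

    Returns (length, score) of the best-scoring extension that begins at
    the seed. Extension halts when the running score falls x_drop or more
    below the best score seen, when the running score reaches zero or
    below, or when either sequence ends. The x_drop default is an
    assumed value chosen for illustration.
    """
    # Score the seed word itself.
    score = 0
    for k in range(seed_len):
        same = query[q_start + k] == subject[s_start + k]
        score += match if same else mismatch
    best_score = score
    best_len = seed_len

    i = q_start + seed_len
    j = s_start + seed_len
    while i < len(query) and j < len(subject):
        score += match if query[i] == subject[j] else mismatch
        i += 1
        j += 1
        if score > best_score:
            best_score = score
            best_len = i - q_start
        if best_score - score >= x_drop or score <= 0:
            break
    return best_len, best_score


# Hypothetical sequences: eight matching bases followed by mismatches.
length, score = extend_right("ACGTACGTTTTT", "ACGTACGTAAAA", 0, 0, 4, x_drop=10)
print(length, score)
```

A full implementation would also extend to the left of the seed and, in BLAST 2.0, permit
gapped extension; those steps are omitted here for brevity.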

The BLAST algorithm also performs a statistical analysis of the similarity between
two sequences (see, e.g., Karlin and Altschul (1993) Proc. Nat'l. Acad. Sci. USA
90:5873-5877). One measure of similarity provided by the BLAST algorithm is the smallest sum
probability (P(N)), which provides an indication of the probability by which a match between
two nucleotide or amino acid sequences would occur by chance. For example, a nucleic acid
is considered similar to a reference sequence if the smallest sum probability in a comparison
of the test nucleic acid to the reference nucleic acid is less than about 0.2, more preferably
less than about 0.01, and most preferably less than about 0.001. Log values may be large
negative numbers, e.g., 5, 10, 20, 30, 40, 40, 70, 90, 110, 150, 170, etc.
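
The thresholds above can be expressed as a simple classification. The sketch below treats the
"about" values as hard cut-offs, which is a simplifying assumption, and reports the negative
base-10 logarithm of the probability; the function names and tier labels are hypothetical.

```python
import math


def similarity_tier(p_value):
    """Map a smallest sum probability P(N) onto the tiers named in the text.

    The cut-offs 0.2, 0.01, and 0.001 come from the description; treating
    them as exact boundaries is an assumption.
    """
    if not 0.0 < p_value <= 1.0:
        raise ValueError("probability must be in (0, 1]")
    if p_value < 0.001:
        return "most preferred"
    if p_value < 0.01:
        return "more preferred"
    if p_value < 0.2:
        return "similar"
    return "not similar"


def negative_log10(p_value):
    """Express a probability as the magnitude of its base-10 logarithm."""
    return -math.log10(p_value)


for p in (0.5, 0.05, 0.005, 1e-20):
    print(p, similarity_tier(p), round(negative_log10(p), 1))
```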

An indication that two nucleic acid sequences are substantially identical is that the 
polypeptide encoded by the first nucleic acid is immunologically cross reactive with the 
antibodies raised against the polypeptide encoded by the second nucleic acid. Thus, a 
polypeptide is typically substantially identical to a second polypeptide, e.g., where the two 
peptides differ only by conservative substitutions. Another indication that two nucleic acid
sequences are substantially identical is that the two molecules or their complements hybridize 
to each other under stringent conditions. Yet another indication that two nucleic acid 
sequences are substantially identical is that the same primers can be used to amplify the 
sequences. 

A "host cell" is a naturally occurring cell or a transformed cell that contains an
expression vector and supports the replication or expression of the expression vector. Host
cells may be cultured cells, explants, cells in vivo, and the like. Host cells may be
prokaryotic cells such as E. coli, or eukaryotic cells such as yeast, insect, amphibian, or
mammalian cells such as CHO, HeLa, and the like (see, e.g., the American Type Culture
Collection catalog or web site, www.atcc.org).

The terms "isolated," "purified," or "biologically pure" refer to material that is 
substantially or essentially free from components that normally accompany it as found in its 
native state. Purity and homogeneity are typically determined using analytical chemistry 
techniques such as polyacrylamide gel electrophoresis or high performance liquid
chromatography. A protein or nucleic acid that is the predominant species present in a
preparation is substantially purified. In particular, an isolated nucleic acid is separated from
some open reading frames that naturally flank the gene and encode proteins other than the
protein encoded by the gene. The term "purified" in some embodiments denotes that a nucleic acid
or protein gives rise to essentially one band in an electrophoretic gel. Preferably, it means
that the nucleic acid or protein is at least about 85% pure, more preferably at least 95% pure,
and most preferably at least 99% pure. "Purify" or "purification" in other embodiments
means removing at least one contaminant or component from the composition to be purified.
In this sense, purification does not require that the purified compound be homogeneous, e.g.,
100% pure.

The terms "polypeptide," "peptide" and "protein" are used interchangeably herein to 
refer to a polymer of amino acid residues. The terms apply to amino acid polymers in which
one or more amino acid residues are artificial chemical mimetics of a corresponding naturally
occurring amino acid, as well as to naturally occurring amino acid polymers, those containing
modified residues, and non-naturally occurring amino acid polymers.

The term "amino acid" refers to naturally occurring and synthetic amino acids, as well 
as amino acid analogs and amino acid mimetics that function similarly to the naturally 
occurring amino acids. Naturally occurring amino acids are those encoded by the genetic 
code, as well as those amino acids that are later modified, e.g., hydroxyproline,
γ-carboxyglutamate, and O-phosphoserine. Amino acid analogs refer to compounds that have
the same basic chemical structure as a naturally occurring amino acid, e.g., an α carbon that is
bound to a hydrogen, a carboxyl group, an amino group, and an R group, e.g., homoserine,
norleucine, methionine sulfoxide, methionine methyl sulfonium. Such analogs may have
modified R groups (e.g., norleucine) or modified peptide backbones, but retain the same basic
chemical structure as a naturally occurring amino acid. Amino acid mimetics refer to
chemical compounds that have a structure that is different from the general chemical 
structure of an amino acid, but that function similarly to another amino acid. 

Amino acids may be referred to herein by either their commonly known three letter 
symbols or by the one-letter symbols recommended by the IUPAC-IUB Biochemical 
Nomenclature Commission. Nucleotides, likewise, may be referred to by their commonly 
accepted single-letter codes. 

"Conservatively modified variants" applies to both amino acid and nucleic acid 
sequences. With respect to particular nucleic acid sequences, conservatively modified 
variants refers to those nucleic acids which encode identical or essentially identical amino 
acid sequences, or where the nucleic acid does not encode an amino acid sequence, to 
essentially identical or associated, e.g., naturally contiguous, sequences. Because of the 
degeneracy of the genetic code, a large number of functionally identical nucleic acids encode 
most proteins. For instance, the codons GCA, GCC, GCG, and GCU each encode the amino 
acid alanine. Thus, at each position where an alanine is specified by a codon, the codon can 
be altered to another of the corresponding codons described without altering the encoded 
polypeptide. Such nucleic acid variations are "silent variations," which are one species of
conservatively modified variations. Every nucleic acid sequence herein which encodes a
polypeptide also describes silent variations of the nucleic acid. In certain contexts each
codon in a nucleic acid (except AUG, which is ordinarily the only codon for methionine, and
UGG, which is ordinarily the only codon for tryptophan) can be modified to yield a
functionally similar molecule. Accordingly, a silent variation of a nucleic acid which
encodes a polypeptide is implicit in a described sequence with respect to the expression
product, but not necessarily with respect to actual probe sequences.
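
The degeneracy described above can be made concrete with a short Python sketch. It builds the standard genetic code table (RNA alphabet) and counts how many nucleic acid sequences encode the same polypeptide. The function names are hypothetical, and the sketch assumes the standard code only; organelle or other variant codes are not modeled.

```python
from itertools import product

# Standard genetic code, codons enumerated in U, C, A, G order ("*" marks a stop).
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def _codons(seq: str) -> list:
    """Split a coding sequence into codons, accepting DNA (T) or RNA (U) letters."""
    rna = seq.upper().replace("T", "U")
    if len(rna) % 3:
        raise ValueError("sequence length must be a multiple of three")
    return [rna[i:i + 3] for i in range(0, len(rna), 3)]


def translate(seq: str) -> str:
    """Translate a coding sequence into one-letter amino acid symbols."""
    return "".join(CODON_TABLE[c] for c in _codons(seq))


def synonymous_codons(codon: str) -> list:
    """Return every codon that encodes the same amino acid as `codon`."""
    target = CODON_TABLE[_codons(codon)[0]]
    return sorted(c for c, aa in CODON_TABLE.items() if aa == target)


def count_synonymous_encodings(seq: str) -> int:
    """Count sequences (including `seq` itself) that encode the same polypeptide."""
    total = 1
    for codon in _codons(seq):
        total *= len(synonymous_codons(codon))
    return total


if __name__ == "__main__":
    print(synonymous_codons("GCA"))                 # the four alanine codons
    print(count_synonymous_encodings("AUGGCAUGG"))  # Met-Ala-Trp
```

In this sketch only the alanine codon of Met-Ala-Trp can vary, which reflects the exception for AUG and UGG noted in the text.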

As to amino acid sequences, one of skill will recognize that individual substitutions,
deletions or additions to a nucleic acid, peptide, polypeptide, or protein sequence which
alter, add or delete a single amino acid or a small percentage of amino acids in the encoded
sequence are "conservatively modified variants" where the alteration results in the substitution
of an amino acid with a chemically similar amino acid. Conservative substitution tables
providing functionally similar amino acids are well known in the art. Such conservatively
modified variants are in addition to and do not exclude polymorphic variants, interspecies
homologs, and alleles of the invention. Typically, conservative substitutions for one
another include: 1) Alanine (A), Glycine (G); 2) Aspartic acid (D), Glutamic acid (E); 3) Asparagine
(N), Glutamine (Q); 4) Arginine (R), Lysine (K); 5) Isoleucine (I), Leucine (L), Methionine
(M), Valine (V); 6) Phenylalanine (F), Tyrosine (Y), Tryptophan (W); 7) Serine (S),
Threonine (T); and 8) Cysteine (C), Methionine (M) (see, e.g., Creighton, Proteins (1984)).
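
As an illustration, the eight groups listed above can be encoded directly. The following Python sketch checks whether two equal-length peptides differ only by substitutions within a group. The function names are hypothetical, and the sketch assumes position-by-position comparison without insertions or deletions.

```python
# The eight substitution groups listed in the text, as one-letter symbols.
CONSERVATIVE_GROUPS = [
    set("AG"),    # 1) Alanine, Glycine
    set("DE"),    # 2) Aspartic acid, Glutamic acid
    set("NQ"),    # 3) Asparagine, Glutamine
    set("RK"),    # 4) Arginine, Lysine
    set("ILMV"),  # 5) Isoleucine, Leucine, Methionine, Valine
    set("FYW"),   # 6) Phenylalanine, Tyrosine, Tryptophan
    set("ST"),    # 7) Serine, Threonine
    set("CM"),    # 8) Cysteine, Methionine
]


def is_conservative_pair(a: str, b: str) -> bool:
    """True if the residues are identical or share at least one listed group."""
    return a == b or any(a in g and b in g for g in CONSERVATIVE_GROUPS)


def differ_only_conservatively(first: str, second: str) -> bool:
    """True if two equal-length peptides differ only by listed substitutions.

    Assumption: identical peptides also return True (no non-conservative change).
    """
    if len(first) != len(second):
        return False
    return all(is_conservative_pair(a, b) for a, b in zip(first, second))


if __name__ == "__main__":
    print(differ_only_conservatively("MKAIS", "MRGLT"))  # K/R, A/G, I/L, S/T
    print(is_conservative_pair("C", "V"))
```

Because methionine appears in both group 5 and group 8, the relation is not transitive: cysteine pairs with methionine and methionine pairs with valine, but cysteine and valine share no group.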

Macromolecular structures such as polypeptide structures can be described in terms of
various levels of organization. For a general discussion of this organization, see, e.g.,
Alberts, et al. (1994) Molecular Biology of the Cell (3rd ed.) and Cantor and Schimmel (1980)
Biophysical Chemistry Part I: The Conformation of Biological Macromolecules. "Primary
structure" refers to the amino acid sequence of a particular peptide. "Secondary structure"
refers to locally ordered, three dimensional structures within a polypeptide. These structures
are commonly known as domains. Domains are portions of a polypeptide that often form a
compact unit of the polypeptide and are typically 25 to approximately 500 amino acids long.
Typical domains are made up of sections of lesser organization such as stretches of β-sheet
and α-helices. "Tertiary structure" refers to the complete three dimensional structure of a
polypeptide monomer. "Quaternary structure" refers to the three dimensional structure
formed, usually by the noncovalent association of independent tertiary units. Anisotropic
terms are also known as energy terms.

"Nucleic acid" or "oligonucleotide" or "polynucleotide" or grammatical equivalents
used herein means at least two nucleotides covalently linked together. Oligonucleotides are
typically from about 5, 6, 7, 8, 9, 10, 12, 15, 25, 30, 40, 50 or more nucleotides in length, up
to about 100 nucleotides in length. Nucleic acids and polynucleotides are polymers of any
length, including longer lengths, e.g., 200, 300, 500, 1000, 2000, 3000, 5000, 7000, 10,000,
etc. A nucleic acid of the present invention will generally contain phosphodiester bonds,
although in some cases, nucleic acid analogs are included that may have at least one different
linkage, e.g., phosphoramidate, phosphorothioate, phosphorodithioate, or
O-methylphosphoroamidite linkages (see Eckstein (1992) Oligonucleotides and Analogues: A
Practical Approach Oxford University Press); and peptide nucleic acid backbones and
linkages. Other analog nucleic acids include those with positive backbones; non-ionic
backbones, and non-ribose backbones, including those described in U.S. Patent Nos.
5,235,033 and 5,034,506, and Chapters 6 and 7, in Sanghvi and Cook, eds. Carbohydrate
Modifications in Antisense Research, ACS Symposium Series 580. Nucleic acids containing
one or more carbocyclic sugars are also included within one definition of nucleic acids.
Modifications of the ribose-phosphate backbone may be done for a variety of reasons, e.g., to
increase the stability and half-life of such molecules in physiological environments or as
probes on a biochip. Mixtures of naturally occurring nucleic acids and analogs can be made;
alternatively, mixtures of different nucleic acid analogs, and mixtures of naturally occurring
nucleic acids and analogs may be made.

Particularly preferred are peptide nucleic acids (PNA), which include peptide nucleic
acid analogs. These backbones are substantially non-ionic under neutral conditions, in
contrast to the highly charged phosphodiester backbone of naturally occurring nucleic acids.
This results in two advantages. First, the PNA backbone exhibits improved hybridization
kinetics. PNAs have larger changes in the melting temperature (Tm) for mismatched versus
perfectly matched basepairs. DNA and RNA typically exhibit a 2-4° C drop in Tm for an
internal mismatch. With the non-ionic PNA backbone, the drop is closer to 7-9° C.
Similarly, due to their non-ionic nature, hybridization of the bases attached to these
backbones is relatively insensitive to salt concentration. In addition, PNAs are not degraded
by cellular enzymes, and thus can be more stable.

The nucleic acids may be single stranded or double stranded, as specified, or contain
portions of both double stranded and single stranded sequence. As will be appreciated by those
in the art, the depiction of a single strand also defines the sequence of the complementary
strand; thus the sequences described herein also provide the complement of the sequence.
The nucleic acid may be DNA, both genomic and cDNA, RNA, or a hybrid, where the
nucleic acid may contain combinations of deoxyribo- and ribo-nucleotides, and combinations
of bases, including uracil, adenine, thymine, cytosine, guanine, inosine, xanthine,
hypoxanthine, isocytosine, isoguanine, etc. "Transcript" typically refers to a naturally
occurring RNA, e.g., a pre-mRNA, hnRNA, or mRNA. As used herein, the term
"nucleoside" includes nucleotides and nucleoside and nucleotide analogs, and modified
nucleosides such as amino modified nucleosides. In addition, "nucleoside" includes
non-naturally occurring analog structures. Thus, e.g., the individual units of a peptide nucleic
acid, each containing a base, are referred to herein as a nucleoside.
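
The statement that a depicted single strand also defines its complement can be sketched in Python as follows. The function name `reverse_complement` is hypothetical, and the sketch assumes only the four standard bases of DNA or RNA; modified bases such as inosine or xanthine mentioned above are rejected rather than guessed at.

```python
_DNA_COMPLEMENT = str.maketrans("ACGT", "TGCA")
_RNA_COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(sequence: str, rna: bool = False) -> str:
    """Return the complementary strand, written 5' to 3'.

    Assumption: only A, C, G and T (or U when rna=True) are accepted.
    """
    seq = sequence.upper()
    allowed = set("ACGU" if rna else "ACGT")
    unexpected = set(seq) - allowed
    if unexpected:
        raise ValueError(f"unsupported bases: {sorted(unexpected)}")
    table = _RNA_COMPLEMENT if rna else _DNA_COMPLEMENT
    # Complement each base, then reverse so the result also reads 5' to 3'.
    return seq.translate(table)[::-1]


if __name__ == "__main__":
    print(reverse_complement("ATGC"))
    print(reverse_complement("AUGC", rna=True))
```

Applying the function twice returns the original sequence, which is one way to check that the two strands define each other.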

A "label" or a "detectable moiety" is a composition detectable by spectroscopic, 
photochemical, biochemical, immunochemical, physiological, chemical, or other physical 
means. For example, useful labels include 32P, fluorescent dyes, electron-dense reagents,
enzymes (e.g., as commonly used in an ELISA), biotin, digoxigenin, or haptens and proteins
or other entities which can be made detectable, e.g., by incorporating a radiolabel into the
peptide or used to detect antibodies specifically reactive with the peptide. The labels may be
incorporated into the cancer nucleic acids, proteins, and antibodies. Many methods known in
the art for conjugating the antibody to the label may be employed, including those methods
described by Hunter, et al. (1962) Nature 144:945; David, et al. (1974) Biochemistry
13:1014-1021; Pain, et al. (1981) J. Immunol. Meth. 40:219-230; and Nygren (1982) J.
Histochem. and Cytochem. 30:407-412.

An "effector" or "effector moiety" or "effector component" is a molecule that is 
bound (or linked, or conjugated), either covalently, through a linker or a chemical bond, or
noncovalently, through ionic, van der Waals, electrostatic, or hydrogen bonds, to an antibody.
The "effector" can be a variety of molecules including, e.g., detection moieties including
radioactive compounds, fluorescent compounds, an enzyme or substrate, tags such as epitope
tags, a toxin; activatable moieties, a chemotherapeutic agent; a lipase; an antibiotic; or a
radioisotope emitting "hard," e.g., beta, radiation.

A "labeled nucleic acid probe or oligonucleotide" is one that is bound, either
covalently, through a linker or a chemical bond, or noncovalently, through ionic, van der
Waals, electrostatic, or hydrogen bonds to a label such that the presence of the probe may be
detected by detecting the presence of the label bound to the probe. Alternatively, methods
using high affinity interactions may achieve the same results where one of a pair of binding
partners binds to the other, e.g., biotin, streptavidin.

As used herein a "nucleic acid probe or oligonucleotide" is a nucleic acid capable of 
binding to a target nucleic acid of complementary sequence through one or more types of 
chemical bonds, usually through complementary base pairing, e.g., through hydrogen bond
formation. As used herein, a probe may include natural (i.e., A, G, C, or T) or modified bases
(7-deazaguanosine, inosine, etc.). In addition, the bases in a probe may be joined by a
linkage other than a phosphodiester bond, preferably one that does not functionally interfere
with hybridization. Thus, e.g., probes may be peptide nucleic acids in which the constituent
bases are joined by peptide bonds rather than phosphodiester linkages. Probes may bind
target sequences lacking complete complementarity with the probe sequence depending upon
the stringency of the hybridization conditions. The probes are preferably directly labeled,
e.g., with isotopes, chromophores, lumiphores, chromogens, or indirectly labeled, e.g., with
biotin to which a streptavidin complex may later bind. By assaying for the presence or
absence of the probe, one can detect the presence or absence of the select sequence or
subsequence. Diagnosis or prognosis may be based at the genomic level, or at the level of
RNA or protein expression.

The term "recombinant" when used with reference, e.g., to a cell, or nucleic acid, 
protein, or vector, indicates that the cell, nucleic acid, protein or vector, has been modified by
the introduction of a heterologous nucleic acid or protein or the alteration of a native nucleic
acid or protein, or that the cell is derived from a cell so modified. Thus, e.g., recombinant
cells express genes that are not found within the native (non-recombinant) form of the cell or
express native genes that are otherwise abnormally expressed, under expressed or not
expressed at all. By the term "recombinant nucleic acid" herein is meant nucleic acid,
originally formed in vitro, in general, by the manipulation of nucleic acid, e.g., using
polymerases and endonucleases, in a form not normally found in nature. In this manner,
operable linkage of different sequences is achieved. Thus an isolated nucleic acid, in a linear
form, or an expression vector formed in vitro by ligating DNA molecules that are not
normally joined, are both considered recombinant for the purposes of this invention. It is
understood that once a recombinant nucleic acid is made and reintroduced into a host cell or
organism, it will replicate non-recombinantly, i.e., using the in vivo cellular machinery of the
host cell rather than in vitro manipulations; however, such nucleic acids, once produced
recombinantly, although subsequently replicated non-recombinantly, are still considered
recombinant for the purposes of the invention. Similarly, a "recombinant protein" is a protein
made using recombinant techniques, i.e., through the expression of a recombinant nucleic
acid as depicted above.

The term "heterologous" when used with reference to portions of a nucleic acid 
indicates that the nucleic acid comprises two or more subsequences that are not normally
found in the same relationship to each other in nature. For instance, the nucleic acid is
typically recombinantly produced, having two or more sequences, e.g., from unrelated genes
arranged to make a new functional nucleic acid, e.g., a promoter from one source and a
coding region from another source. Similarly, a heterologous protein will often refer to two
or more subsequences that are not found in the same relationship to each other in nature (e.g.,
a fusion protein). 

A "promoter" is typically an array of nucleic acid control sequences that direct 
transcription of a nucleic acid. As used herein, a promoter includes necessary nucleic acid 
sequences near the start site of transcription, such as, in the case of a polymerase II type
promoter, a TATA element. A promoter also optionally includes distal enhancer or repressor
elements, which can be located as much as several thousand base pairs from the start site of
transcription. A "constitutive" promoter is a promoter that is active under most
environmental and developmental conditions. An "inducible" promoter is a promoter that is
active under environmental or developmental regulation. The term "operably linked" refers
to a functional linkage between a nucleic acid expression control sequence (such as a
promoter, or array of transcription factor binding sites) and a second nucleic acid sequence,
e.g., wherein the expression control sequence directs transcription of the nucleic acid
corresponding to the second sequence.

An "expression vector" is a nucleic acid construct, generated recombinantly or
synthetically, with a series of specified nucleic acid elements that permit transcription of a
particular nucleic acid in a host cell. The expression vector can be part of a plasmid, virus, or
nucleic acid fragment. Typically, the expression vector includes a nucleic acid to be
transcribed in operable linkage to a promoter.

The phrase "selectively (or specifically) hybridizes to" refers to the binding,
duplexing, or hybridizing of a molecule selectively to a particular nucleotide sequence under
stringent hybridization conditions when that sequence is present in a complex mixture (e.g.,
total cellular or library DNA or RNA).

The phrase "stringent hybridization conditions" refers to conditions under which a
probe will hybridize to its target subsequence, typically in a complex mixture of nucleic
acids, but to essentially no other sequences. Stringent conditions are sequence-dependent and
will be different in different circumstances. Longer sequences hybridize specifically at
higher temperatures. An extensive guide to the hybridization of nucleic acids is found in
"Overview of principles of hybridization and the strategy of nucleic acid assays" in Tijssen
(1993) Techniques in Biochemistry and Molecular Biology: Hybridization with Nucleic
Acid Probes (vol. 24) Elsevier. Generally, stringent conditions are selected to be about 5-10° C
lower than the thermal melting point (Tm) for the specific sequence at a defined ionic strength
and pH. The Tm is the temperature (under defined ionic strength, pH, and nucleic acid
concentration) at which 50% of the probes complementary to the target hybridize to the target
sequence at equilibrium (as the target sequences are present in excess, at Tm, 50% of the probes
are occupied at equilibrium). Stringent conditions will be those in which the salt concentration is
less than about 1.0 M sodium ion, typically about 0.01 to 1.0 M sodium ion concentration (or
other salts) at pH 7.0 to 8.3 and the temperature is at least about 30° C for short probes (e.g.,
10 to 50 nucleotides) and at least about 60° C for long probes (e.g., greater than 50
nucleotides). Stringent conditions may also be achieved with the addition of destabilizing
agents such as formamide. For selective or specific hybridization, a positive signal is
typically at least two times background, preferably 10 times background hybridization.
Exemplary stringent hybridization conditions are often: 50% formamide, 5x SSC, and 1%
SDS, incubating at 42° C, or, 5x SSC, 1% SDS, incubating at 65° C, with wash in 0.2x SSC,
and 0.1% SDS at 65° C. For PCR, a temperature of about 36° C is typical for low stringency
amplification, although annealing temperatures may vary between about 32° C and 48° C
depending on primer length. For high stringency PCR amplification, a temperature of about
62° C is typical, although high stringency annealing temperatures can range from about 50° C
to about 65° C, depending on the primer length and specificity. Typical cycle conditions for
both high and low stringency amplifications include a denaturation phase of 90° C - 95° C for
0.5 - 2 min., an annealing phase lasting 0.5 - 2 min., and an extension phase of about 72° C
for 1 - 2 min. Protocols and guidelines for low and high stringency amplification reactions
are provided, e.g., in Innis, et al. (1990) PCR Protocols, A Guide to Methods and
Applications.
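
The numeric guidelines in the preceding paragraph can be collected into a small Python sketch. All names here (`HybridizationConditions`, `meets_stringent_guidelines`, and so on) are hypothetical, the "about" values are treated as hard limits, and destabilizing agents such as formamide are not modeled, so a formamide-based protocol at 42° C would be rejected by this simplified check even though the text lists it as exemplary.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class HybridizationConditions:
    sodium_molar: float    # sodium ion concentration, mol/L
    ph: float
    temperature_c: float   # incubation temperature, degrees C
    probe_length_nt: int   # probe length in nucleotides


def minimum_temperature_c(probe_length_nt: int) -> float:
    """Lower bound from the text: ~30 C for 10-50 nt probes, ~60 C above 50 nt."""
    if probe_length_nt < 10:
        raise ValueError("the text gives no guideline for probes shorter than 10 nt")
    return 30.0 if probe_length_nt <= 50 else 60.0


def stringent_window_c(tm_c: float) -> Tuple[float, float]:
    """Temperature window roughly 5-10 C below the thermal melting point Tm."""
    return (tm_c - 10.0, tm_c - 5.0)


def meets_stringent_guidelines(c: HybridizationConditions) -> bool:
    """Check salt, pH and temperature against the typical ranges in the text."""
    return (
        0.01 <= c.sodium_molar <= 1.0
        and 7.0 <= c.ph <= 8.3
        and c.temperature_c >= minimum_temperature_c(c.probe_length_nt)
    )


def is_positive_signal(signal: float, background: float, fold: float = 2.0) -> bool:
    """A positive signal is at least `fold` times background (2x, preferably 10x)."""
    if background <= 0:
        raise ValueError("background must be positive")
    return signal >= fold * background


if __name__ == "__main__":
    example = HybridizationConditions(0.5, 7.5, 65.0, 200)  # illustrative values
    print(meets_stringent_guidelines(example))
    print(stringent_window_c(70.0))
    print(is_positive_signal(250.0, 100.0), is_positive_signal(250.0, 100.0, fold=10.0))
```

The example values are illustrative only and are not taken from any protocol in this document.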

Nucleic acids that do not hybridize to each other under stringent conditions are still
substantially identical if the polypeptides which they encode are substantially identical. This
occurs, e.g., when a copy of a nucleic acid is created using the maximum codon degeneracy
permitted by the genetic code. In such cases, the nucleic acids typically hybridize under
moderately stringent hybridization conditions. Exemplary "moderately stringent
hybridization conditions" include a hybridization in a buffer of 40% formamide, 1 M NaCl,
1% SDS at 37° C, and a wash in 1X SSC at 45° C. A positive hybridization is at least twice
background. Alternative hybridization and wash conditions can be utilized to provide
conditions of similar stringency. Additional guidelines for determining hybridization
parameters are provided in numerous references, e.g., Ausubel, et al. (ed.) Current Protocols in
Molecular Biology Lippincott.

The phrase "functional effects" in the context of assays for testing compounds that
modulate activity of a lung cancer protein includes the determination of a parameter that is
indirectly or directly under the influence of the lung cancer protein or nucleic acid, e.g., a
physiological, enzymatic, functional, physical, or chemical effect, such as the ability to
decrease lung cancer. It includes ligand binding activity; cell viability, cell growth on soft
agar; anchorage dependence; contact inhibition and density limitation of growth; cellular
proliferation; cellular transformation; growth factor or serum dependence; tumor specific
marker levels; invasiveness into Matrigel; tumor growth and metastasis in vivo; mRNA and
protein expression in cells undergoing metastasis, and other characteristics of lung cancer
cells. "Functional effects" include in vitro, in vivo, and ex vivo activities.

By "determining the functional effect" is meant assaying for a compound that
increases or decreases a parameter that is indirectly or directly under the influence of a lung
cancer protein sequence, e.g., physiological, functional, enzymatic, physical, or chemical
effects. Such functional effects can be measured by many means known to those skilled in
the art, e.g., changes in spectroscopic characteristics (e.g., fluorescence, absorbance,
refractive index), hydrodynamic (e.g., shape), chromatographic, or solubility properties for
the protein, measuring inducible markers or transcriptional activation of the lung cancer
protein; measuring binding activity or binding assays, e.g., binding to antibodies or other
ligands, and measuring cellular proliferation. Determination of the functional effect of a
compound on lung cancer can also be performed using lung cancer assays known to those of
skill in the art such as in vitro assays, e.g., cell growth on soft agar; anchorage
dependence; contact inhibition and density limitation of growth; cellular proliferation;
cellular transformation; growth factor or serum dependence; tumor specific marker levels;
invasiveness into Matrigel; tumor growth and metastasis in vivo; mRNA and protein
expression in cells undergoing metastasis, and other characteristics of lung cancer cells. The
functional effects can be evaluated by many means known to those skilled in the art, e.g.,
microscopy for quantitative or qualitative measures of alterations in morphological features,
measurement of changes in RNA or protein levels for lung cancer-associated sequences,
measurement of RNA stability, identification of downstream or reporter gene expression
(CAT, luciferase, β-gal, GFP, and the like), e.g., via chemiluminescence, fluorescence,
colorimetric reactions, antibody binding, inducible markers, and ligand binding assays.

"Inhibitors", "activators", and "modulators" of lung cancer polynucleotide and
polypeptide sequences are used to refer to activating, inhibitory, or modulating molecules or
compounds identified using in vitro and in vivo assays of lung cancer polynucleotide and
polypeptide sequences. Inhibitors are compounds that, e.g., bind to, partially or totally block
activity, decrease, prevent, delay activation, inactivate, desensitize, or down regulate the
activity or expression of lung cancer proteins, e.g., antagonists. Antisense or inhibitory
nucleic acids may seem to inhibit expression and subsequent function of the protein.
"Activators" are compounds that increase, open, activate, facilitate, enhance activation,
sensitize, agonize, or up regulate lung cancer protein activity. Inhibitors, activators, or
modulators also include genetically modified versions of lung cancer proteins, e.g., versions
with altered activity, as well as naturally occurring and synthetic ligands, antagonists,
agonists, antibodies, small chemical molecules and the like. Such assays for inhibitors and
activators include, e.g., expressing the lung cancer protein in vitro, in cells, or cell
membranes, applying putative modulator compounds, and then determining the functional
effects on activity, as described above. Activators and inhibitors of lung cancer can also be
identified by incubating lung cancer cells with the test compound and determining increases
or decreases in the expression of 1 or more lung cancer proteins, e.g., 1, 2, 3, 4, 5, 10, 15, 20,
25, 30, 40, 50 or more lung cancer proteins, such as lung cancer proteins encoded by the
sequences set out in Tables 1A-16.

Samples or assays comprising lung cancer proteins that are treated with a potential 
activator, inhibitor, or modulator are compared to control samples without the inhibitor, 
activator, or modulator to examine the extent of inhibition. Control samples (untreated with
inhibitors) are assigned a relative protein activity value of 100%. Inhibition of a polypeptide
is achieved when the activity value relative to the control is about 80%, preferably 50%, more
preferably 25-0%. Activation of a lung cancer polypeptide is achieved when the activity
value relative to the control (untreated with activators) is 110%, more preferably 150%, more
preferably 200-500% (i.e., two to five fold higher relative to the control), more preferably
1000-3000% higher.
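
The comparison to an untreated control can be sketched in Python as follows. The function names and the "no call" label are hypothetical, and the sketch assumes the outermost thresholds in the text (about 80% for inhibition, 110% for activation) are applied as inclusive cutoffs.

```python
def relative_activity_percent(treated: float, control: float) -> float:
    """Express treated-sample activity as a percentage of the control (control = 100%)."""
    if control <= 0:
        raise ValueError("control activity must be positive")
    return 100.0 * treated / control


def classify_modulation(percent_of_control: float) -> str:
    """Classify a relative activity value using the outermost thresholds in the text.

    Assumption: values between the two cutoffs are reported as "no call".
    """
    if percent_of_control <= 80.0:
        return "inhibition"
    if percent_of_control >= 110.0:
        return "activation"
    return "no call"


if __name__ == "__main__":
    value = relative_activity_percent(treated=40.0, control=80.0)
    print(value, classify_modulation(value))
```

Stricter calls (for example 50% or 25% for inhibition, 150% or more for activation) could be obtained by passing tighter cutoffs, following the preferred ranges given above.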

The phrase "changes in cell growth" refers to any change in cell growth and
proliferation characteristics in vitro or in vivo, such as cell viability, formation of foci,
anchorage independence, semi-solid or soft agar growth, changes in contact inhibition and
density limitation of growth, loss of growth factor or serum requirements, changes in cell
morphology, gaining or losing immortalization, gaining or losing tumor specific markers,
ability to form or suppress tumors when injected into suitable animal hosts, and/or
immortalization of the cell. See, e.g., Freshney (1994) Culture of Animal Cells a Manual of
Basic Technique pp. 231-241 (3rd ed.).

"Tumor cell" refers to precancerous, cancerous, and normal cells in a tumor. 
"Cancer cells," "transformed" cells, or "transformation" in tissue culture, refers to 
spontaneous or induced phenotypic changes that do not necessarily involve the uptake of new 
genetic material. Although transformation can arise from infection with a transforming virus
and incorporation of new genomic DNA, or uptake of exogenous DNA, it can also arise
spontaneously or following exposure to a carcinogen, thereby mutating an endogenous gene.
Transformation is associated with phenotypic changes, such as immortalization of cells,
aberrant growth control, nonmorphological changes, and/or malignancy (see, Freshney
(1994) Culture of Animal Cells a Manual of Basic Technique (3rd ed.)).

"Antibody" refers to a polypeptide comprising a framework region from an
immunoglobulin gene or fragments thereof that specifically binds and recognizes an antigen.
The recognized immunoglobulin genes include the kappa, lambda, alpha, gamma, delta,
epsilon, and mu constant region genes, as well as the myriad immunoglobulin variable region
genes. Light chains are classified as either kappa or lambda. Heavy chains are classified as
gamma, mu, alpha, delta, or epsilon, which in turn define the immunoglobulin classes, IgG,
IgM, IgA, IgD, and IgE, respectively. Typically, the antigen-binding region of an antibody or
its functional equivalent will be most critical in specificity and affinity of binding. See Paul,
Fundamental Immunology.

An exemplary immunoglobulin (antibody) structural unit comprises a tetramer. Each
tetramer is composed of two identical pairs of polypeptide chains, each pair having one
"light" (about 25 kD) and one "heavy" chain (about 50-70 kD). The N-terminus of each
chain defines a variable region of about 100 to 110 or more amino acids primarily responsible
for antigen recognition. The terms variable light chain (VL) and variable heavy chain (VH)
refer to these light and heavy chains respectively.

Antibodies exist, e.g., as intact immunoglobulins or as a number of well-characterized
fragments produced by digestion with various peptidases. Thus, e.g., pepsin digests an
antibody below the disulfide linkages in the hinge region to produce F(ab)'2, a dimer of Fab
which itself is a light chain joined to VH-CH1 by a disulfide bond. The F(ab)'2 may be
reduced under mild conditions to break the disulfide linkage in the hinge region, thereby
converting the F(ab)'2 dimer into an Fab' monomer. The Fab' monomer is essentially Fab
with part of the hinge region (see Paul (ed. 1999) Fundamental Immunology (4th ed.)). While
various antibody fragments are defined in terms of the digestion of an intact antibody, one of
skill will appreciate that such fragments may be synthesized de novo either chemically or by
using recombinant DNA methodology. Thus, the term antibody, as used herein, also includes
antibody fragments either produced by the modification of whole antibodies, or those
synthesized de novo using recombinant DNA methodologies (e.g., single chain Fv) or those
identified using phage display libraries (see, e.g., McCafferty, et al. (1990) Nature
348:552-554).

For preparation of antibodies, e.g., recombinant, monoclonal, or polyclonal
antibodies, many techniques known in the art can be used (see, e.g., Kohler and Milstein
(1975) Nature 256:495-497; Kozbor, et al. (1983) Immunology Today 4:72; Cole, et al.
(1985), pp. 77-96 in Monoclonal Antibodies and Cancer Therapy; Coligan (1991 and
supplements) Current Protocols in Immunology; Harlow and Lane (1988) Antibodies: A
Laboratory Manual; and Goding (1986) Monoclonal Antibodies: Principles and Practice (2d
ed.)). Techniques for the production of single chain antibodies (U.S. Patent 4,946,778) can
be adapted to produce antibodies to polypeptides of this invention. Also, transgenic mice, or
other organisms such as other mammals, may be used to express humanized antibodies.
Alternatively, phage display technology can be used to identify antibodies and heteromeric
Fab fragments that specifically bind to selected antigens (see, e.g., McCafferty, et al. (1990)
Nature 348:552-554; Marks, et al. (1992) Biotechnology 10:779-783).

A "chimeric antibody" is an antibody molecule in which, e.g., (a) the constant region,
or a portion thereof, is altered, replaced, or exchanged so that the antigen binding site
(variable region) is linked to a constant region of a different or altered class, effector
function, and/or species, or an entirely different molecule which confers new properties to the
chimeric antibody, e.g., an enzyme, toxin, hormone, growth factor, drug, etc.; or (b) the
variable region, or a portion thereof, is altered, replaced, or exchanged with a variable region
having a different or altered antigen specificity.



Identification of lung cancer-associated sequences 

In one aspect, the expression levels of genes are determined in different patient
samples for which diagnosis information is desired, to provide expression profiles. An
expression profile of a particular sample is essentially a "fingerprint" of the state of the
sample; while two states may have any particular gene similarly expressed, the evaluation of
a number of genes simultaneously allows the generation of a gene expression profile that is
characteristic of the state of the cell. That is, normal tissue may be distinguished from
cancerous or metastatic cancerous tissue, or metastatic cancerous tissue can be compared
with tissue from surviving cancer patients. By comparing expression profiles of tissue in
known different lung cancer states, information regarding which genes are important
(including both up- and down-regulation of genes) in each of these states is obtained.
Molecular profiling may distinguish subtypes of a currently collective disease designation,
e.g., different forms of lung cancer (chronic disease, adenocarcinoma, etc.).

The identification of sequences that are differentially expressed in lung cancer versus
non-lung cancer tissue allows the use of this information in a number of ways. For example,
a particular treatment regime may be evaluated: does a chemotherapeutic drug act to
down-regulate lung cancer, and thus tumor growth or recurrence, in a particular patient?
Alternatively, a treatment step may induce other markers which may be used as targets to
destroy tumor cells. Similarly, diagnoses and treatment outcomes may be made or confirmed
by comparing patient samples with the known expression profiles. Malignant disease may be
compared to non-malignant conditions. Metastatic tissue can also be analyzed to determine
the stage of lung cancer in the tissue, or origin of primary tumor, e.g., metastasis from a
remote primary site. Furthermore, these gene expression profiles (or individual genes) allow
screening of drug candidates with an eye to mimicking or altering a particular expression
profile; e.g., screening can be done for drugs that suppress the lung cancer expression profile.
This may be done by making biochips comprising sets of the important lung cancer genes,
which can then be used in these screens. PCR methods may be applied with selected primer
pairs, and analysis may be of RNA or of genomic sequences. These methods can also be
done on the protein basis; that is, protein expression levels of the lung cancer proteins can be
evaluated for diagnostic purposes or to screen candidate agents. In addition, the lung cancer
nucleic acid sequences can be administered for gene therapy purposes, including the
administration of antisense nucleic acids, or the lung cancer proteins (including antibodies
and other modulators thereof) administered as therapeutic drugs or as protein or DNA
vaccines.

Thus the present invention provides nucleic acid and protein sequences that are
differentially expressed in lung cancer relative to normal tissues and/or non-malignant lung
disease, or in different types of lung disease, herein termed "lung cancer sequences." As
outlined below, lung cancer sequences include those that are up-regulated (i.e., expressed at a
higher level) in lung cancer, as well as those that are down-regulated (i.e., expressed at a
lower level). In a preferred embodiment, the lung cancer sequences are from humans;
however, as will be appreciated by those in the art, lung cancer sequences from other
organisms may be useful in animal models of disease and drug evaluation; thus, other lung
cancer sequences are provided, from vertebrates, including mammals, including rodents (rats,
mice, hamsters, guinea pigs, etc.), primates, farm animals (including sheep, goats, pigs, cows,
horses, etc.) and pets (dogs, cats, etc.). Lung cancer sequences from other organisms may be
obtained using the techniques outlined below.

Lung cancer sequences can include both nucleic acid and amino acid sequences. As
will be appreciated by those in the art and is more fully outlined below, lung cancer nucleic
acid sequences are useful in a variety of applications, including diagnostic applications,
which will detect naturally occurring nucleic acids, as well as screening applications; e.g.,
biochips comprising nucleic acid probes or PCR microtiter plates with selected probes to the
lung cancer sequences can be generated.

A lung cancer sequence can be initially identified by substantial nucleic acid and/or
amino acid sequence homology to the lung cancer sequences outlined herein. Such
homology can be based upon the overall nucleic acid or amino acid sequence, and is
generally determined as outlined below, e.g., using homology programs or hybridization
conditions.

For identifying lung cancer-associated sequences, the lung cancer screen typically
includes comparing genes identified in different tissues, e.g., normal and cancerous tissues,
cancer and non-malignant conditions, non-malignant conditions and normal tissues, or tumor
tissue samples from patients who have metastatic disease vs. non-metastatic tissue. Other
suitable tissue comparisons include comparing lung cancer samples with metastatic cancer
samples from other cancers, such as breast, other gastrointestinal cancers, prostate, ovarian,
etc. Samples of non-metastatic disease tissue and tissue undergoing metastasis are applied to
biochips comprising nucleic acid probes. The samples are first microdissected, if applicable,
and treated as is known in the art for the preparation of mRNA. Suitable biochips are
commercially available, e.g., from Affymetrix, Santa Clara, CA. Gene expression profiles as
described herein are generated and the data analyzed.

In one embodiment, the genes showing changes in expression as between normal and
disease states are compared to genes expressed in other normal tissues, preferably normal
lung, but also including, but not limited to, colon, heart, brain, liver, breast, kidney, muscle,
prostate, small intestine, large intestine, spleen, bone, and/or placenta. In a preferred
embodiment, those genes identified during the lung cancer screen that are expressed in
significant amounts in other tissues (e.g., essential organs) are removed from the profile,
although in some embodiments, this is not necessary (e.g., where organs may be dispensable
at a later stage of life). That is, when screening for drugs, it is usually preferable that the
target expression be disease specific, to minimize possible side effects on other organs.
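
The removal of candidate genes that are expressed in significant amounts in other tissues can be illustrated with a short sketch. It is a minimal illustration only, assuming that expression values are held in a plain dictionary keyed by gene and tissue; the gene names, the tissue list, and the `max_normal_level` cutoff are hypothetical and are not taken from this disclosure.

```python
from typing import Dict, List

# Hypothetical expression levels (arbitrary units) measured in normal tissues.
NormalExpression = Dict[str, Dict[str, float]]


def filter_disease_specific(
    candidates: List[str],
    normal_expression: NormalExpression,
    essential_tissues: List[str],
    max_normal_level: float,
) -> List[str]:
    """Keep candidates whose level in every listed tissue is below the cutoff."""
    kept = []
    for gene in candidates:
        levels = normal_expression.get(gene, {})
        # Assumption: a tissue with no measurement is treated as not expressing the gene.
        if all(levels.get(t, 0.0) < max_normal_level for t in essential_tissues):
            kept.append(gene)
    return kept


candidates = ["GENE_A", "GENE_B", "GENE_C"]  # hypothetical screen hits
normal_expression = {
    "GENE_A": {"heart": 5.0, "liver": 8.0},
    "GENE_B": {"heart": 250.0, "liver": 12.0},  # high in heart, so it is removed
    "GENE_C": {"brain": 3.0},
}
profile = filter_disease_specific(
    candidates, normal_expression, ["heart", "liver", "brain"], max_normal_level=100.0
)
print(profile)
```

In embodiments where the filter is not necessary, the same function could simply be called with an empty tissue list, which keeps every candidate.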

In a preferred embodiment, lung cancer sequences are those that are up-regulated in
lung cancer; that is, the expression of these genes is higher in cancerous tissue than in normal
lung or other tissue. "Up-regulation" as used herein means, when the ratio is presented as a
number greater than one, that the ratio is greater than one, preferably 1.5 or greater, more
preferably 2.0 or greater. Another embodiment is directed to sequences up-regulated in
non-malignant conditions relative to normal. Unigene cluster identification numbers and
accession numbers herein are for the GenBank sequence database and the sequences of the
accession numbers are hereby expressly incorporated by reference. GenBank is known in the
art, see, e.g., Benson, D.A., et al. (1998) Nucleic Acids Research 26:1-7 and
http://www.ncbi.nlm.nih.gov/. Sequences are also available in other databases, e.g.,
European Molecular Biology Laboratory (EMBL) and DNA Database of Japan (DDBJ).
In some situations, the sequences may be derived from assembly of
available sequences or be predicted from genomic DNA using exon prediction algorithms,
such as FGENESH (Salamov and Solovyev (2000) Genome Res. 10:516-522). In other
situations, sequences have been derived from cloning and sequencing of isolated nucleic
acids.

In another preferred embodiment, lung cancer sequences are those that are
down-regulated in the lung cancer; that is, the expression of these genes is lower in cancerous
tissue than in normal lung or other tissue. "Down-regulation" as used herein means, when the
ratio is presented as a number greater than one, that the ratio is greater than one, preferably 1.5 or
greater, more preferably 2.0 or greater, or, when the ratio is presented as a number less than
one, that the ratio is less than one, preferably 0.5 or less, more preferably 0.25 or less.
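
The ratio thresholds recited for up-regulation and down-regulation can be expressed as a small classification routine. The sketch below is illustrative only and assumes that the ratio is computed as the expression level in cancerous tissue divided by the level in normal tissue, so that down-regulation appears as a number less than one; the function name and the tier labels are hypothetical.

```python
from typing import Tuple


def classify_regulation(tumor_level: float, normal_level: float) -> Tuple[str, str]:
    """Return (direction, tier) for a tumor/normal expression ratio."""
    if tumor_level <= 0 or normal_level <= 0:
        raise ValueError("expression levels must be positive")
    ratio = tumor_level / normal_level

    if ratio > 1.0:
        # Up-regulation: greater than one, preferably >= 1.5, more preferably >= 2.0.
        if ratio >= 2.0:
            return ("up", "more preferred")
        if ratio >= 1.5:
            return ("up", "preferred")
        return ("up", "basic")
    if ratio < 1.0:
        # Down-regulation: less than one, preferably <= 0.5, more preferably <= 0.25.
        if ratio <= 0.25:
            return ("down", "more preferred")
        if ratio <= 0.5:
            return ("down", "preferred")
        return ("down", "basic")
    return ("unchanged", "")


print(classify_regulation(200.0, 100.0))
print(classify_regulation(25.0, 100.0))
```

A down-regulation ratio presented as a number greater than one (normal divided by tumor) can be handled by swapping the two arguments and reading an "up" result as down-regulation.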

Informatics 

The ability to identify genes that are over or under expressed in lung cancer can
additionally provide high-resolution, high-sensitivity datasets which can be used in the areas
of diagnostics, therapeutics, drug development, pharmacogenetics, protein structure,
biosensor development, and other related areas. For example, the expression profiles can be
used in diagnostic or prognostic evaluation of patients with lung cancer. Or as another
example, subcellular toxicological information can be generated to better direct drug structure
and activity correlation (see Anderson (1998) Pharmaceutical Proteomics: Targets,
Mechanism, and Function, paper presented at the IBC Proteomics conference, Coronado, CA
(June 11-12, 1998)). Subcellular toxicological information can also be utilized in a biological
sensor device to predict the likely toxicological effect of chemical exposures and likely
tolerable exposure thresholds (see U.S. Patent No. 5,811,231). Similar advantages accrue
from datasets relevant to other biomolecules and bioactive agents (e.g., nucleic acids,
saccharides, lipids, drugs, and the like).

Thus, in another embodiment, the present invention provides a database that includes
at least one set of assay data. The data contained in the database is acquired, e.g., using array
analysis either singly or in a library format. The database can be in a form in which data can
be maintained and transmitted, but is preferably an electronic database. The electronic
database of the invention can be maintained on any electronic device allowing for the storage
of and access to the database, such as a personal computer, but is preferably distributed on a
wide area network, such as the World Wide Web.

The focus of the present section on databases that include peptide sequence data is for 
clarity of illustration only. It will be apparent to those of skill in the art that similar databases 
can be assembled for assay data acquired using an assay of the invention. 

The compositions and methods for identifying and/or quantitating the relative and/or
absolute abundance of a variety of molecular and macromolecular species from a biological
sample representing lung cancer, i.e., the identification of lung cancer-associated sequences
described herein, provide an abundance of information, which can be correlated with
pathological conditions, predisposition to disease, drug testing, therapeutic monitoring,
gene-disease causal linkages, identification of correlates of immunity and physiological status,
among others. Although the data generated from the assays of the invention is suited for
manual review and analysis, in a preferred embodiment, data processing using high-speed
computers is utilized.

An array of methods for indexing and retrieving biomolecular information is known
in the art. For example, U.S. Patents 6,023,659 and 5,966,712 disclose a relational database
system for storing biomolecular sequence information in a manner that allows sequences to
be catalogued and searched according to one or more protein function hierarchies. U.S.
Patent 5,953,727 discloses a relational database having sequence records containing
information in a format that allows a collection of partial-length DNA sequences to be
catalogued and searched according to association with one or more sequencing projects for
obtaining full-length sequences from the collection of partial length sequences. U.S. Patent
5,706,498 discloses a gene database retrieval system for making a retrieval of a gene
sequence similar to a sequence data item in a gene database based on the degree of similarity
between a key sequence and a target sequence. U.S. Patent 5,538,897 discloses a method
using mass spectroscopy fragmentation patterns of peptides to identify amino acid sequences
in computer databases by comparison of predicted mass spectra with experimentally-derived
mass spectra using a closeness-of-fit measure. U.S. Patent 5,926,818 discloses a
multi-dimensional database comprising a functionality for multi-dimensional data analysis
described as on-line analytical processing (OLAP), which entails the consolidation of
projected and actual data according to more than one consolidation path or dimension. U.S.
Patent 5,295,261 reports a hybrid database structure in which the fields of each database
record are divided into two classes, navigational and informational data, with navigational
fields stored in a hierarchical topological map which can be viewed as a tree structure or as
the merger of two or more such tree structures.

See also Mount, et al. (2001) Bioinformatics; Durbin, et al. (eds., 1999) Biological
Sequence Analysis: Probabilistic Models of Proteins and Nucleic Acids; Baxevanis and
Ouellette (eds., 1998) Bioinformatics: A Practical Guide to the Analysis of Genes and
Proteins; Rashidi and Buehler (1999) Bioinformatics: Basic Applications in Biological
Science and Medicine; Setubal, et al. (eds., 1997) Introduction to Computational Molecular
Biology; Misener and Krawetz (eds., 2000) Bioinformatics: Methods and Protocols; Higgins
and Taylor (eds., 2000) Bioinformatics: Sequence, Structure, and Databanks: A Practical
Approach; Brown (2001) Bioinformatics: A Biologist's Guide to Biocomputing and the
Internet; Han and Kamber (2000) Data Mining: Concepts and Techniques; and
Waterman (1995) Introduction to Computational Biology: Maps, Sequences, and Genomes.

The present invention provides a computer database comprising a computer and
software for storing in computer-retrievable form assay data records cross-tabulated, e.g.,
with data specifying the source of the target-containing sample from which each sequence
specificity record was obtained.

In an exemplary embodiment, at least one of the sources of target-containing sample
is from a control tissue sample known to be free of pathological disorders. In a variation, at
least one of the sources is a known pathological tissue specimen, e.g., a neoplastic lesion or
another tissue specimen to be analyzed for lung cancer. In another variation, the assay
records cross-tabulate one or more of the following parameters for each target species in a
sample: (1) a unique identification code, which can include, e.g., a target molecular structure
and/or characteristic separation coordinate (e.g., electrophoretic coordinates); (2) sample
source; and (3) absolute and/or relative quantity of the target species present in the sample.
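
An assay record carrying the three parameters listed above, and a cross-tabulation of such records by sample source, might be sketched as follows. This is an illustration under stated assumptions rather than a prescribed implementation: the `AssayRecord` class, its field names, and the sample identifiers are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class AssayRecord:
    target_id: str      # (1) unique identification code for the target species
    sample_source: str  # (2) sample source, e.g., control or pathological specimen
    quantity: float     # (3) absolute or relative quantity in the sample


def cross_tabulate(records: List[AssayRecord]) -> Dict[str, Dict[str, float]]:
    """Build a target-by-source table of quantities from a list of records."""
    table: Dict[str, Dict[str, float]] = defaultdict(dict)
    for rec in records:
        table[rec.target_id][rec.sample_source] = rec.quantity
    # Convert to plain dictionaries so that missing keys raise instead of being created.
    return {target: dict(by_source) for target, by_source in table.items()}


records = [
    AssayRecord("T001", "control", 1.0),
    AssayRecord("T001", "neoplastic lesion", 4.2),
    AssayRecord("T002", "control", 3.0),
]
table = cross_tabulate(records)
print(table["T001"])
```

Keeping each record immutable and deriving the table on demand is one possible design; a relational database with the same three columns would serve equally well.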

The invention also provides for the storage and retrieval of a collection of target data
in a computer data storage apparatus, which can include magnetic disks, optical disks,
magneto-optical disks, DRAM, SRAM, SGRAM, SDRAM, RDRAM, DDR RAM, magnetic
bubble memory devices, and other data storage devices, including CPU registers and on-CPU
data storage arrays. Typically, the target data records are stored as a bit pattern in an array of
magnetic domains on a magnetizable medium or as an array of charge states or transistor gate
states, such as an array of cells in a DRAM device (e.g., each cell comprised of a transistor
and a charge storage area, which may be on the transistor). In one embodiment, the invention
provides such storage devices, and computer systems built therewith, comprising a bit pattern
encoding a protein expression fingerprint record comprising unique identifiers for at least 10
target data records cross-tabulated with target source.

When the target is a peptide or nucleic acid, the invention preferably provides a
method for identifying related peptide or nucleic acid sequences, comprising performing a
computerized comparison between a peptide or nucleic acid sequence assay record stored in
or retrieved from a computer storage device or database and at least one other sequence. The
comparison can include a sequence analysis or comparison algorithm or computer program
embodiment thereof (e.g., FASTA, TFASTA, GAP, BESTFIT) and/or the comparison may
be of the relative amount of a peptide or nucleic acid sequence in a pool of sequences
determined from a polypeptide or nucleic acid sample of a specimen.

The invention also preferably provides a magnetic disk, such as an IBM-compatible
(DOS, Windows, Windows95/98/2000, Windows NT, OS/2) or other format (e.g., Linux,
SunOS, Solaris, AIX, SCO Unix, VMS, MV, Macintosh, etc.) floppy diskette or hard (fixed,
Winchester) disk drive, comprising a bit pattern encoding data from an assay of the invention
in a file format suitable for retrieval and processing in a computerized sequence analysis,
comparison, or relative quantitation method.

The invention also provides a network, comprising a plurality of computing devices
linked via a data link, such as an Ethernet cable (coax or 10BaseT), telephone line, ISDN
line, wireless network, optical fiber, or other suitable signal transmission medium, whereby at
least one network device (e.g., computer, disk array, etc.) comprises a pattern of magnetic
domains (e.g., magnetic disk) and/or charge domains (e.g., an array of DRAM cells)
composing a bit pattern encoding data acquired from an assay of the invention.

The invention also provides a method for transmitting assay data that includes
generating an electronic signal on an electronic communications device, such as a modem,
ISDN terminal adapter, DSL, cable modem, ATM switch, or the like, wherein the signal
includes (in native or encrypted format) a bit pattern encoding data from an assay or a
database comprising a plurality of assay results obtained by the method of the invention.

In a preferred embodiment, the invention provides a computer system for comparing a
query target to a database containing an array of data structures, such as an assay result
obtained by the method of the invention, and ranking database targets based on the degree of
identity and gap weight to the target data. A central processor is preferably initialized to load
and execute the computer program for alignment and/or comparison of the assay results.
Data for a query target is entered into the central processor via an I/O device. Execution of
the computer program results in the central processor retrieving the assay data from the data
file, which comprises a binary description of an assay result.

The target data or record and the computer program can be transferred to secondary
memory, which is typically random access memory (e.g., DRAM, SRAM, SGRAM, or
SDRAM). Targets are ranked according to the degree of correspondence between a selected
assay characteristic (e.g., binding to a selected affinity moiety) and the same characteristic of
the query target and results are output via an I/O device. For example, a central processor
can be a conventional computer (e.g., Intel Pentium, PowerPC, Alpha, PA-8000, SPARC,
MIPS 4400, MIPS 10000, VAX, etc.); a program can be a commercial or public domain
molecular biology software package (e.g., UWGCG Sequence Analysis Software, Darwin); a
data file can be an optical or magnetic disk, a data server, a memory device (e.g., DRAM,
SRAM, SGRAM, SDRAM, EPROM, bubble memory, flash memory, etc.); an I/O device can
be a terminal comprising a video display and a keyboard, a modem, an ISDN terminal
adapter, an Ethernet port, a punched card reader, a magnetic strip reader, or other suitable I/O
device.

The invention also preferably provides the use of a computer system, such as that
described above, which comprises: (1) a computer; (2) a stored bit pattern encoding a
collection of peptide sequence specificity records obtained by the methods of the invention,
which may be stored in the computer; (3) a comparison target, such as a query target; and (4)
a program for alignment and comparison, typically with rank-ordering of comparison results
on the basis of computed similarity values.
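
The rank-ordering of stored records against a query target on the basis of computed similarity values can be sketched in a few lines. The example below is a simplified stand-in, not one of the alignment programs named in this specification: it substitutes the general-purpose `difflib.SequenceMatcher` ratio from the Python standard library for a true alignment score, and the record names and peptide strings are hypothetical.

```python
from difflib import SequenceMatcher
from typing import Dict, List, Tuple


def rank_targets(query: str, database: Dict[str, str]) -> List[Tuple[str, float]]:
    """Score each stored sequence against the query and sort best-first."""
    scored = [
        # ratio() returns a similarity value between 0.0 and 1.0; it is a generic
        # string measure used here only as a placeholder for an alignment score.
        (name, SequenceMatcher(None, query, seq, autojunk=False).ratio())
        for name, seq in database.items()
    ]
    return sorted(scored, key=lambda item: item[1], reverse=True)


database = {
    "rec3": "GGGGGGGGGG",
    "rec2": "MKTAYIAKQL",
    "rec1": "MKTAYIAKQR",
}
ranking = rank_targets("MKTAYIAKQR", database)
for name, score in ranking:
    print(name, round(score, 2))
```

A production system would replace the scoring line with a call to a dedicated alignment tool and would typically report gap penalties and percent identity alongside the rank.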

Characteristics of lung cancer-associated proteins

Lung cancer proteins of the present invention may be classified as secreted proteins,
transmembrane proteins or intracellular proteins. In one embodiment, the lung cancer protein
is an intracellular protein. Intracellular proteins may be found in the cytoplasm and/or in the
nucleus. Intracellular proteins are involved in all aspects of cellular function and replication
(including, e.g., signaling pathways); aberrant expression of such proteins often results in
unregulated or dysregulated cellular processes (see, e.g., Alberts (ed. 1994) Molecular
Biology of the Cell (3d ed.)). For example, many intracellular proteins have enzymatic
activity such as protein kinase activity, protein phosphatase activity, protease activity,
nucleotide cyclase activity, polymerase activity and the like. Intracellular proteins also serve
as docking proteins that are involved in organizing complexes of proteins, or targeting
proteins to various subcellular localizations, and are involved in maintaining the structural
integrity of organelles.

An increasingly appreciated concept in characterizing proteins is the presence in the
proteins of one or more structural motifs for which defined functions have been attributed. In
addition to the highly conserved sequences found in the enzymatic domain of proteins, highly
conserved sequences have been identified in proteins that are involved in protein-protein
interaction. For example, Src-homology-2 (SH2) domains bind tyrosine-phosphorylated
targets in a sequence dependent manner. PTB domains, which are distinct from SH2
domains, also bind tyrosine phosphorylated targets. SH3 domains bind to proline-rich
targets. In addition, PH domains, tetratricopeptide repeats, and WD domains, to name only a
few, have been shown to mediate protein-protein interactions. Some of these may also be
involved in binding to phospholipids or other second messengers. As will be appreciated by
one of ordinary skill in the art, these motifs can be identified on the basis of amino acid
sequence; thus, an analysis of the sequence of proteins may provide insight into the
enzymatic potential of the molecule and/or the molecules with which the protein may associate.

One useful database is Pfam (protein families), which is a large collection of multiple
sequence alignments and hidden Markov models covering many common protein domains.
Versions are available via the internet from Washington University in St. Louis, the Sanger
Center in England, and the Karolinska Institute in Sweden (see, e.g., Bateman, et al. (2000)
Nuc. Acids Res. 28:263-266; Sonnhammer, et al. (1997) Proteins 28:405-420; Bateman, et al.
(1999) Nuc. Acids Res. 27:260-262; and Sonnhammer, et al. (1998) Nuc. Acids Res. 26:320-
322).

In another embodiment, the lung cancer sequences are transmembrane proteins.
Transmembrane proteins are molecules that span a phospholipid bilayer of a cell. They may
have an intracellular domain, an extracellular domain, or both. The intracellular domains of
such proteins may have a number of functions including those already described for
intracellular proteins. For example, the intracellular domain may have enzymatic activity
and/or may serve as a binding site for additional proteins. Frequently the intracellular
domain of transmembrane proteins serves both roles. For example, certain receptor tyrosine
kinases have both protein kinase activity and SH2 domains. In addition, autophosphorylation
of tyrosines on the receptor molecule itself creates binding sites for additional SH2 domain
containing proteins.

Transmembrane proteins may contain from one to many transmembrane domains.
For example, receptor tyrosine kinases, certain cytokine receptors, receptor guanylyl cyclases
and receptor serine/threonine protein kinases contain a single transmembrane domain.
However, various other proteins including channels, pumps, and adenylyl cyclases contain
numerous transmembrane domains. Many important cell surface receptors such as G protein
coupled receptors (GPCRs) are classified as "seven transmembrane domain" proteins, as they
contain 7 membrane spanning regions. Characteristics of transmembrane domains include
approximately 17 consecutive hydrophobic amino acids that may be followed by charged
amino acids. Therefore, upon analysis of the amino acid sequence of a particular protein, the
localization and number of transmembrane domains within the protein may be predicted (see,
e.g., PSORT web site http://psort.nibb.ac.jp/).
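
The idea of predicting transmembrane domains from a run of approximately 17 consecutive hydrophobic amino acids can be illustrated with a simple scan. The sketch below is deliberately naive and is not the method used by PSORT or any other published predictor, which rely on hydropathy scales and windowed averages; the residue set treated as hydrophobic and the toy sequence are assumptions made for illustration.

```python
from typing import List, Tuple

# Assumed set of hydrophobic residues (one-letter codes); other choices are possible.
HYDROPHOBIC = set("AVLIMFWC")


def find_hydrophobic_runs(sequence: str, min_length: int = 17) -> List[Tuple[int, int]]:
    """Return (start, end) index pairs, end exclusive, of long hydrophobic runs."""
    runs = []
    start = None
    for i, residue in enumerate(sequence.upper()):
        if residue in HYDROPHOBIC:
            if start is None:
                start = i  # a new run begins here
        else:
            if start is not None and i - start >= min_length:
                runs.append((start, i))
            start = None
    # Handle a run that extends to the end of the sequence.
    if start is not None and len(sequence) - start >= min_length:
        runs.append((start, len(sequence)))
    return runs


# Toy sequence: a 20-residue leucine stretch and a shorter 10-residue stretch.
toy = "KR" + "L" * 20 + "KRDE" + "AV" * 5 + "K"
candidate_domains = find_hydrophobic_runs(toy)
print(candidate_domains, "predicted domain count:", len(candidate_domains))
```

Only the 20-residue stretch meets the length threshold here; the shorter stretch is ignored, which mirrors the length criterion described in the text.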

The extracellular domains of transmembrane proteins are diverse; however, conserved
motifs are found repeatedly among various extracellular domains. Conserved structure
and/or functions have been ascribed to different extracellular motifs. Many extracellular
domains are involved in binding to other molecules. In one aspect, extracellular domains are
found on receptors. Factors that bind the receptor domain include circulating ligands, which
may be peptides, proteins, or small molecules such as adenosine and the like. For example,
growth factors such as EGF, FGF, and PDGF are circulating growth factors that bind to their
cognate receptors to initiate a variety of cellular responses. Other factors include cytokines,
mitogenic factors, hormones, neurotrophic factors and the like. Extracellular domains also
bind to cell-associated molecules. In this respect, they may mediate cell-cell interactions.
Cell-associated ligands can be tethered to the cell, e.g., via a glycosylphosphatidylinositol
(GPI) anchor, or may themselves be transmembrane proteins. Extracellular domains may
also associate with the extracellular matrix and contribute to the maintenance of the cell
structure.

Lung cancer proteins that are transmembrane are particularly preferred in the present
invention as they are readily accessible targets for extracellular immunotherapeutics, as are
described herein. In addition, as outlined below, transmembrane proteins can also be useful
in imaging modalities. Antibodies may be used to label such readily accessible proteins in
situ or in histological analysis. Alternatively, antibodies can also label intracellular proteins,
in which case analytical samples are typically permeabilized to provide access to intracellular
proteins. In addition, some membrane proteins can be processed to release a soluble protein,
or to expose a residual fragment. Released soluble proteins may be useful diagnostic
markers; processed residual protein fragments may be useful markers of lung disease.

It will also be appreciated by those in the art that a transmembrane protein can be 
made soluble by removing transmembrane sequences, e.g., through recombinant methods. 
Furthermore, transmembrane proteins that have been made soluble can be made to be 
secreted through recombinant means by adding an appropriate signal sequence. 

In another embodiment, the lung cancer proteins are secreted proteins, the secretion of
which can be either constitutive or regulated. These proteins may have a signal peptide or
signal sequence that targets the molecule to the secretory pathway. Secreted proteins are
involved in numerous physiological events; e.g., if circulating, they often serve to transmit
signals to various other cell types. The secreted protein may function in an autocrine manner
(acting on the cell that secreted the factor), a paracrine manner (acting on cells in close
proximity to the cell that secreted the factor), an endocrine manner (acting on cells at a
distance, e.g., secretion into the blood stream), or an exocrine manner (secretion, e.g., through
a duct or to an adjacent epithelial surface, as with sweat glands, sebaceous glands, pancreatic
ducts, lacrimal glands, mammary glands, wax-producing glands of the ear, etc.). Thus secreted
molecules often find use in modulating or altering numerous aspects of physiology. Lung cancer
proteins that are secreted proteins are particularly preferred in the present invention as they
serve as good targets for diagnostic markers, e.g., for blood, plasma, serum, or stool tests.
Those which are enzymes may be antibody or small molecule targets. Others may be useful
as vaccine targets, e.g., via CTL mechanisms.

Use of lung cancer nucleic acids 

As described above, a lung cancer sequence is initially identified by substantial nucleic
acid and/or amino acid sequence homology or linkage to the lung cancer sequences outlined
herein. Such homology can be based upon the overall nucleic acid or amino acid sequence,
and is generally determined as outlined below, using either homology programs or
hybridization conditions. Typically, linked sequences on an mRNA are found on the same
molecule.

20 The lung cancer nucleic acid sequences of the invention, e.g., the sequences in Tables 

1A-16, can be fragments of larger genes, i.e., they are nucleic acid segments. "Genes" in this 
context includes coding regions, non-coding regions, and mixtures of coding and non-coding 
regions. Accordingly, as will be appreciated by those in the art, using the sequences provided 
herein, extended sequences, in either direction, of the lung cancer genes can be obtained, 

using techniques well known in the art for cloning either longer sequences or the full length
sequences; see Ausubel, et al., supra. Much can be done by informatics and many sequences 
can be clustered to include multiple sequences corresponding to a single gene, e.g., systems 
such as UniGene (see, http://www.ncbi.nlm.nih.gov/UniGene/). 

Once a lung cancer nucleic acid is identified, it can be cloned and, if necessary, its 

constituent parts recombined to form the entire lung cancer nucleic acid coding regions or the
entire mRNA sequence. Once isolated from its natural source, e.g., contained within a 
plasmid or other vector or excised therefrom as a linear nucleic acid segment, the 
recombinant lung cancer nucleic acid can be further used as a probe to identify and isolate
other lung cancer nucleic acids, e.g., extended coding regions. It can also be used as a

"precursor" nucleic acid to make modified or variant lung cancer nucleic acids and proteins. 

The lung cancer nucleic acids of the present invention are used in several ways. In a 

first embodiment, nucleic acid probes to the lung cancer nucleic acids are made and attached 

to biochips to be used in screening and diagnostic methods, as outlined below, or for

administration, e.g., for gene therapy, RNAi, vaccine, and/or antisense applications. 

Alternatively, the lung cancer nucleic acids that include coding regions of lung cancer 

proteins can be put into expression vectors for the expression of lung cancer proteins, again 

for screening purposes or for administration to a patient. 

In a preferred embodiment, nucleic acid probes to lung cancer nucleic acids (both the

nucleic acid sequences outlined in the figures and/or the complements thereof) are made. 
The nucleic acid probes attached to the biochip are designed to be substantially 
complementary to the lung cancer nucleic acids, i.e., the target sequence (either the target 
sequence of the sample or to other probe sequences, e.g., in sandwich assays), such that 

hybridization of the target sequence and the probes of the present invention occurs. As

outlined below, this complementarity need not be perfect; there may be any number of base 
pair mismatches which will interfere with hybridization between the target sequence and the 
single stranded nucleic acids of the present invention. However, if the number of mutations 
is so great that no hybridization can occur under even the least stringent of hybridization 

conditions, the sequence is not a complementary target sequence. Thus, by "substantially
complementary" herein is meant that the probes are sufficiently complementary to the target 
sequences to hybridize under appropriate reaction conditions, particularly high stringency 
conditions, as outlined herein. 

A nucleic acid probe is generally single stranded but can be partially single and 

partially double stranded. The strandedness of the probe is dictated by the structure,

composition, and properties of the target sequence. In general, the nucleic acid probes range 
from about 8 to about 100 bases long, with from about 10 to about 80 bases being preferred, 
and from about 30 to about 50 bases being particularly preferred. That is, generally 
complements of ORFs or whole genes are not used. In some embodiments, nucleic acids of 

lengths up to hundreds of bases can be used.

In a preferred embodiment, more than one probe per sequence is used, with either 
overlapping probes or probes to different sections of the target being used. That is, two, 
three, four or more probes, with three being preferred, are used to build in a redundancy for a 
particular target. The probes can be overlapping (i.e., have some sequence in common), or

separate. In some cases, PCR primers may be used to amplify signal for higher sensitivity. 
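The probe length and redundancy guidance above can be illustrated with a minimal sketch. It assumes a DNA target written 5' to 3' using only A, C, G, and T, and it selects evenly spaced windows that may or may not overlap depending on target length. The function names `reverse_complement` and `tile_probes`, the default length of 40 bases, and the even-spacing rule are illustrative assumptions, not part of the described method.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    pairs = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(pairs[base] for base in reversed(seq.upper()))


def tile_probes(target: str, probe_len: int = 40, n_probes: int = 3) -> list[str]:
    """Pick up to n_probes evenly spaced windows of the target and return
    their reverse complements as candidate probe sequences.

    Hypothetical helper: the 8-100 base limits mirror the general range
    given in the text; the spacing rule is an assumption for illustration.
    """
    if not 8 <= probe_len <= 100:
        raise ValueError("probe length outside the general 8-100 base range")
    if n_probes < 1:
        raise ValueError("at least one probe is required")
    if len(target) < probe_len:
        raise ValueError("target is shorter than the requested probe length")
    last_start = len(target) - probe_len
    if n_probes == 1:
        starts = [0]
    else:
        # Evenly spaced start positions; duplicates collapse on short targets.
        starts = sorted({round(i * last_start / (n_probes - 1)) for i in range(n_probes)})
    return [reverse_complement(target[s:s + probe_len]) for s in starts]
```

With a 100-base target and the defaults, the windows start at positions 0, 30, and 60, so adjacent probes share 10 bases of sequence.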

As will be appreciated by those in the art, nucleic acids can be attached or 

immobilized to a solid support in a wide variety of ways. By "immobilized" and grammatical 

equivalents herein is meant that the association or binding between the nucleic acid probe and the

solid support is sufficient to be stable under the conditions of binding, washing, analysis, and 

removal as outlined below. The binding can typically be covalent or non-covalent. By "non- 

covalent binding" and grammatical equivalents herein is typically meant one or more of 

electrostatic, hydrophilic, and hydrophobic interactions. Included in non-covalent binding is 

the covalent attachment of a molecule, such as streptavidin, to the support and the non-
covalent binding of the biotinylated probe to the streptavidin. By "covalent binding" and
grammatical equivalents herein is meant that the two moieties, the solid support and the 
probe, are attached by at least one bond, including sigma bonds, pi bonds and coordination 
bonds. Covalent bonds can be formed directly between the probe and the solid support or can 

be formed by a cross linker or by inclusion of a specific reactive group on either the solid
support or the probe or both molecules. Immobilization may also involve a combination of 
covalent and non-covalent interactions. 

In general, the probes are attached to a biochip in a wide variety of ways, as will be 
appreciated by those in the art. As described herein, the nucleic acids can either be 

synthesized first, with subsequent attachment to the biochip, or can be directly synthesized on
the biochip. 

The biochip comprises a suitable solid substrate. By "substrate" or "solid support" or 
other grammatical equivalents herein is meant a material that can be modified for the 
attachment or association of the nucleic acid probes and is amenable to at least one detection 

method. Often the substrate may contain discrete individual sites appropriate for individual
partitioning and identification. As will be appreciated by those in the art, the number of
possible substrates is very large; possible substrates include, but are not limited to, glass and modified or
functionalized glass, plastics (including acrylics, polystyrene and copolymers of styrene and 
other materials, polypropylene, polyethylene, polybutylene, polyurethanes, Teflon, etc.), 

polysaccharides, nylon or nitrocellulose, resins, silica or silica-based materials including
silicon and modified silicon, carbon, metals, inorganic glasses, plastics, etc. In general, the
substrates allow optical detection and do not appreciably fluoresce. A preferred substrate is 
described in US application entitled Reusable Low Fluorescent Plastic Biochip, U.S. 
Application Serial No. 09/270,214, filed March 15, 1999, herein incorporated by reference in

its entirety. 

Generally the substrate is planar, although as will be appreciated by those in the art, 
other configurations of substrates may be used as well. For example, the probes may be 
placed on the inside surface of a tube, for flow-through sample analysis to minimize sample
volume. Similarly, the substrate may be flexible, such as a flexible foam, including closed 
cell foams made of particular plastics. 

In a preferred embodiment, the surface of the biochip and the probe may be 
derivatized with chemical functional groups for subsequent attachment of the two. Thus, e.g., 

the biochip is derivatized with a chemical functional group including, but not limited to,
amino groups, carboxy groups, oxo groups and thiol groups, with amino groups being 
particularly preferred. Using these functional groups, the probes can be attached using 
functional groups on the probes. For example, nucleic acids containing amino groups can be 
attached to surfaces comprising amino groups, e.g., using linkers as are known in the art; e.g., 

homo- or hetero-bifunctional linkers as are well known (see 1994 Pierce Chemical Company
catalog, technical section on cross-linkers, pages 155-200). In addition, in some cases, 
additional linkers, such as alkyl groups (including substituted and heteroalkyl groups) may be 
used. 

In this embodiment, oligonucleotides are synthesized, and then attached to the surface 
of the solid support. Either the 5' or 3' terminus may be attached to the solid support, or

attachment may be via linkage to an internal nucleoside. 

In another embodiment, the immobilization to the solid support may be very strong, 

yet non-covalent. For example, biotinylated oligonucleotides can be made, which bind to 

surfaces covalently coated with streptavidin, resulting in attachment. 
Alternatively, the oligonucleotides may be synthesized on the surface, as is known in

the art. For example, photoactivation techniques utilizing photopolymerization compounds 

and techniques are used. In a preferred embodiment, the nucleic acids can be synthesized in 

situ, using known photolithographic techniques, such as those described in WO 95/25116;

WO 95/35505; U.S. Patent Nos. 5,700,637 and 5,445,934; and references cited within, all of 
which are expressly incorporated by reference; these methods of attachment form the basis of

the Affymetrix GeneChip™ technology. 

Often, amplification-based assays are performed to measure the expression level of 

lung cancer-associated sequences. These assays are typically performed in conjunction with 
reverse transcription. In such assays, a lung cancer-associated nucleic acid sequence acts as a
template in an amplification reaction (e.g., Polymerase Chain Reaction, or PCR). In a
quantitative amplification, the amount of amplification product will be proportional to the 
amount of template in the original sample. Comparison to appropriate controls provides a 
measure of the amount of lung cancer-associated RNA. Methods of quantitative

amplification are well known to those of skill in the art. Detailed protocols for quantitative 
PCR are provided, e.g., in Innis, et al. (1990) PCR Protocols: A Guide to Methods and
Applications.
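The comparison of a sample to appropriate controls can be sketched numerically. The example below uses the comparative threshold-cycle calculation, which is one common way of expressing relative template amount and is not specifically named in this description. It assumes that each amplification cycle multiplies the product by a fixed factor (2.0 by default) and that a reference transcript is measured alongside the target. The function name and all parameter names are hypothetical.

```python
def relative_expression(
    ct_target_sample: float,
    ct_ref_sample: float,
    ct_target_control: float,
    ct_ref_control: float,
    efficiency: float = 2.0,
) -> float:
    """Estimate target abundance in a sample relative to a control.

    Assumption: product increases by `efficiency`-fold per cycle, so a
    lower threshold cycle (Ct) means more starting template. Each Ct is
    first normalized to a reference transcript measured in the same
    specimen, then the sample is compared to the control.
    """
    if efficiency <= 1.0:
        raise ValueError("efficiency must be greater than 1")
    delta_sample = ct_target_sample - ct_ref_sample
    delta_control = ct_target_control - ct_ref_control
    return efficiency ** -(delta_sample - delta_control)
```

For instance, if the target crosses threshold three cycles earlier in the sample than in the control after normalization, the sketch reports roughly an eight-fold higher level in the sample under the doubling assumption.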

In some embodiments, a TaqMan based assay is used to measure expression. 

TaqMan based assays use a fluorogenic oligonucleotide probe that contains a 5' fluorescent
dye and a 3' quenching agent. The probe hybridizes to a PCR product, but cannot itself be
extended due to a blocking agent at the 3' end. When the PCR product is amplified in
subsequent cycles, the 5' nuclease activity of the polymerase, e.g., AmpliTaq, results in the
cleavage of the TaqMan probe. This cleavage separates the 5' fluorescent dye and the 3'

quenching agent, thereby resulting in an increase in fluorescence as a function of

amplification (see, e.g., literature provided by Perkin-Elmer, e.g., www2.perkin-elmer.com). 

Other suitable amplification methods include, but are not limited to, ligase chain 
reaction (LCR) (see Wu and Wallace (1989) Genomics 4:560, Landegren, et al. (1988) 
Science 241:1077, and Barringer, et al. (1990) Gene 89:117), transcription amplification 

(Kwoh, et al. (1989) Proc. Natl. Acad. Sci. USA 86:1173), self-sustained sequence
replication (Guatelli, et al. (1990) Proc. Natl. Acad. Sci. USA 87:1874), dot PCR, and linker
adapter PCR, etc. 

Expression of lung cancer proteins from nucleic acids

In a preferred embodiment, lung cancer nucleic acids, e.g., encoding lung cancer

proteins, are used to make a variety of expression vectors to express lung cancer proteins 
which can then be used in screening assays, as described below. Expression vectors and 
recombinant DNA technology are well known to those of skill in the art (see, e.g., Ausubel, 
supra, and Fernandez and Hoeffler (eds 1999) Gene Expression Systems) and are used to 

express proteins. The expression vectors may be either self-replicating extrachromosomal
vectors or vectors which integrate into a host genome. Generally, these expression vectors 
include transcriptional and translational regulatory nucleic acid operably linked to the nucleic 
acid encoding the lung cancer protein. The term "control sequences" refers to DNA 
sequences used for the expression of an operably linked coding sequence in a particular host

organism. Control sequences that are suitable for prokaryotes, e.g., include a promoter, 

optionally an operator sequence, and a ribosome binding site. Eukaryotic cells are known to 

utilize promoters, polyadenylation signals, and enhancers. 

Nucleic acid is "operably linked" when it is placed into a functional relationship with

another nucleic acid sequence. For example, DNA for a presequence or secretory leader is 

operably linked to DNA for a polypeptide if it is expressed as a preprotein that participates in 

the secretion of the polypeptide; a promoter or enhancer is operably linked to a coding 

sequence if it affects the transcription of the sequence; or a ribosome binding site is operably 

linked to a coding sequence if it is positioned so as to facilitate translation. Generally,

"operably linked" means that the DNA sequences being linked are contiguous, and, in the 
case of a secretory leader, contiguous and in reading phase. However, enhancers do not have 
to be contiguous. Linking is typically accomplished by ligation at convenient restriction 
sites. If such sites do not exist, synthetic oligonucleotide adaptors or linkers are used in 

accordance with conventional practice. Transcriptional and translational regulatory nucleic
acid will generally be appropriate to the host cell used to express the lung cancer protein. 
Numerous types of appropriate expression vectors, and suitable regulatory sequences are 
known in the art for a variety of host cells. 

In general, transcriptional and translational regulatory sequences may include, but are 

not limited to, promoter sequences, ribosomal binding sites, transcriptional start and stop

sequences, translational start and stop sequences, and enhancer or activator sequences. In a 
preferred embodiment, the regulatory sequences include a promoter and transcriptional start 
and stop sequences. 

Promoter sequences may be either constitutive or inducible promoters. The promoters 
may be either naturally occurring promoters or hybrid promoters. Hybrid promoters, which
combine elements of more than one promoter, are also known in the art, and are useful in the 
present invention. 

In addition, an expression vector may comprise additional elements. For example, the 
expression vector may have two replication systems, thus allowing it to be maintained in two 
organisms, e.g., in mammalian or insect cells for expression and in a prokaryotic host for
cloning and amplification. Furthermore, for integrating expression vectors, the expression 
vector often contains at least one sequence homologous to the host cell genome, and 
preferably two homologous sequences which flank the expression construct. The integrating 
vector may be directed to a specific locus in the host cell by selecting the appropriate
homologous sequence for inclusion in the vector. Constructs for integrating vectors are well 
known in the art (e.g., Fernandez and Hoeffler, supra). 

In addition, in a preferred embodiment, the expression vector contains a selectable 
marker gene to allow the selection of transformed host cells. Selection genes are well known
in the art and will vary with the host cell used. 

The lung cancer proteins of the present invention are usually produced by culturing a 
host cell transformed with an expression vector containing nucleic acid encoding a lung 
cancer protein, under the appropriate conditions to induce or cause expression of the lung 

cancer protein. Conditions appropriate for lung cancer protein expression will vary with the
choice of the expression vector and the host cell, and will be easily ascertained by one skilled 
in the art through routine experimentation or optimization. For example, the use of 
constitutive promoters in the expression vector will require optimizing the growth and 
proliferation of the host cell, while the use of an inducible promoter requires the appropriate 

growth conditions for induction. In addition, in some embodiments, the timing of the harvest
is important. For example, the baculoviral systems used in insect cell expression are lytic 
viruses, and thus harvest time selection can be crucial for product yield. 

Appropriate host cells include yeast, bacteria, archaebacteria, fungi, and insect and 
animal cells, including mammalian cells. Of particular interest are Saccharomyces cerevisiae 

and other yeasts, E. coli, Bacillus subtilis, Sf9 cells, C129 cells, 293 cells, Neurospora, BHK,
CHO, COS, HeLa cells, HUVEC (human umbilical vein endothelial cells), THP1 cells (a 
macrophage cell line) and various other human cells and cell lines. 

In a preferred embodiment, the lung cancer proteins are expressed in mammalian 
cells. Mammalian expression systems are also known in the art, and include retroviral and 

adenoviral systems. Of particular use as mammalian promoters are the promoters from
mammalian viral genes, since the viral genes are often highly expressed and have a broad 
host range. Examples include the SV40 early promoter, mouse mammary tumor virus LTR 
promoter, adenovirus major late promoter, herpes simplex virus promoter, and the CMV 
promoter (see, e.g., Fernandez and Hoeffler, supra). Typically, transcription termination and 

polyadenylation sequences recognized by mammalian cells are regulatory regions located 3'
to the translation stop codon and thus, together with the promoter elements, flank the coding 
sequence. Examples of transcription terminator and polyadenylation signals include those 
derived from SV40.

The methods of introducing exogenous nucleic acid into mammalian hosts, as well as
other hosts, are well known in the art, and will vary with the host cell used. Techniques
include dextran-mediated transfection, calcium phosphate precipitation, polybrene mediated 
transfection, protoplast fusion, electroporation, viral infection, encapsulation of the 
polynucleotide(s) in liposomes, and direct microinjection of the DNA into nuclei.

In a preferred embodiment, lung cancer proteins are expressed in bacterial systems. 
Promoters from bacteriophage may also be used and are known in the art. In addition, 
synthetic promoters and hybrid promoters are also useful; e.g., the tac promoter is a hybrid of 
the trp and lac promoter sequences. Furthermore, a bacterial promoter can include naturally 

occurring promoters of non-bacterial origin that have the ability to bind bacterial RNA
polymerase and initiate transcription. In addition to a functioning promoter sequence, an 
efficient ribosome binding site is desirable. The expression vector may also include a signal 
peptide sequence that provides for secretion of the lung cancer protein in bacteria. The 
protein is either secreted into the growth media (gram-positive bacteria) or into the 

periplasmic space, located between the inner and outer membrane of the cell (gram-negative
bacteria). The bacterial expression vector may also include a selectable marker gene to allow 
for the selection of bacterial strains that have been transformed. Suitable selection genes 
include genes which render the bacteria resistant to drugs such as ampicillin, 
chloramphenicol, erythromycin, kanamycin, neomycin and tetracycline. Selectable markers 

also include biosynthetic genes, such as those in the histidine, tryptophan and leucine

biosynthetic pathways. These components are assembled into expression vectors. Expression 
vectors for bacteria are well known in the art, and include vectors for Bacillus subtilis, E. 
coli, Streptococcus cremoris, and Streptococcus lividans, among others (e.g., Fernandez and 
Hoeffler, supra). The bacterial expression vectors are transformed into bacterial host cells 

using techniques well known in the art, such as calcium chloride treatment, electroporation,
and others. 

In one embodiment, lung cancer proteins are produced in insect cells. Expression 
vectors for the transformation of insect cells, and in particular, baculovirus-based expression 
vectors, are well known in the art. 
In a preferred embodiment, lung cancer protein is produced in yeast cells. Yeast

expression systems are well known in the art, and include expression vectors for 
Saccharomyces cerevisiae, Candida albicans and C. maltosa, Hansenula polymorpha, 
Kluyveromyces fragilis and K. lactis, Pichia guilliermondii and P. pastoris,
Schizosaccharomyces pombe, and Yarrowia lipolytica.

The lung cancer protein may also be made as a fusion protein, using techniques well 

known in the art. Thus, e.g., for the creation of monoclonal antibodies, if the desired epitope 

is small, the lung cancer protein may be fused to a carrier protein to form an immunogen.

Alternatively, the lung cancer protein may be made as a fusion protein to increase expression 

for affinity purification purposes, or for other reasons. For example, when the lung cancer 

protein is a lung cancer peptide, the nucleic acid encoding the peptide may be linked to other 

nucleic acid for expression purposes. 

In a preferred embodiment, the lung cancer protein is purified or isolated after

expression. Lung cancer proteins may be isolated or purified in a variety of appropriate 
ways. Standard purification methods include electrophoretic, molecular, immunological and 
chromatographic techniques, including ion exchange, hydrophobic, affinity, and reverse- 
phase HPLC chromatography, and chromatofocusing. For example, the lung cancer protein 

may be purified using a standard anti-lung cancer protein antibody column. Ultrafiltration
and diafiltration techniques, in conjunction with protein concentration, are also useful. For 
general guidance in suitable purification techniques, see Scopes (1982) Protein Purification.
The degree of purification necessary will vary depending on the use of the lung cancer 
protein. In some instances no purification will be necessary. 

Once expressed and purified if necessary, the lung cancer proteins and nucleic acids

are useful in a number of applications. They may be used as immunoselection reagents, as 
vaccine reagents, as screening agents, therapeutic entities, for production of antibodies, as 
transcription or translation inhibitors, etc. 

Variants of lung cancer proteins

In one embodiment, the lung cancer proteins are derivative or variant lung cancer 
proteins as compared to the wild-type sequence. That is, as outlined more fully below, the 
derivative lung cancer peptide will often contain at least one amino acid substitution, deletion 
or insertion, with amino acid substitutions being particularly preferred. The amino acid 
substitution, insertion or deletion may occur at a particular residue within the lung cancer
peptide. 

Also included within one embodiment of lung cancer proteins of the present invention 
are amino acid sequence variants. These variants typically fall into one or more of three 
classes: substitutional, insertional or deletional variants. These variants ordinarily are

prepared by site specific mutagenesis of nucleotides in the DNA encoding the lung cancer 

protein, using cassette or PCR mutagenesis or other techniques, to produce DNA encoding 

the variant, and thereafter expressing the DNA in recombinant cell culture as outlined above. 

However, variant lung cancer protein fragments having up to about 100-150 residues may be

prepared by in vitro synthesis. Amino acid sequence variants are characterized by the 

predetermined nature of the variation, a feature that sets them apart from naturally occurring 

allelic or interspecies variation of the lung cancer protein amino acid sequence. The variants 

typically exhibit a similar qualitative biological activity as the naturally occurring analogue, 

although variants can also be selected which have modified characteristics as will be more
fully outlined below. 

While the site or region for introducing an amino acid sequence variation is often 
predetermined, the mutation per se need not be predetermined. For example, in order to 
optimize the performance of a mutation at a given site, random mutagenesis may be 

conducted at the target codon or region and the expressed lung cancer variants screened for
the optimal combination of desired activity. Techniques exist for making substitution
mutations at predetermined sites in DNA having a known sequence, e.g., M13 primer
mutagenesis and PCR mutagenesis. Screening of mutants is often done using assays of lung 
cancer protein activities. 

Amino acid substitutions are typically of single residues; insertions usually will be on

the order of from about 1 to 20 amino acids, although considerably larger insertions may be 
occasionally tolerated. Deletions generally range from about 1 to about 20 residues, although 
in some cases deletions may be much larger. 
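The three classes of variant described above can be illustrated on a protein sequence written in one-letter code. The following is a minimal sketch only. It assumes 1-based residue numbering from the N-terminus, and the function names `substitute`, `insert_residues`, and `delete_residues`, together with the example sequence, are hypothetical.

```python
def substitute(seq: str, pos: int, new_residue: str) -> str:
    """Replace the single residue at 1-based position pos."""
    if not 1 <= pos <= len(seq) or len(new_residue) != 1:
        raise ValueError("invalid position or residue")
    return seq[:pos - 1] + new_residue + seq[pos:]


def insert_residues(seq: str, pos: int, residues: str) -> str:
    """Insert residues after 1-based position pos (pos=0 is the N-terminus)."""
    if not 0 <= pos <= len(seq):
        raise ValueError("invalid position")
    return seq[:pos] + residues + seq[pos:]


def delete_residues(seq: str, start: int, end: int) -> str:
    """Delete residues from 1-based start through end, inclusive."""
    if not 1 <= start <= end <= len(seq):
        raise ValueError("invalid range")
    return seq[:start - 1] + seq[end:]


wild_type = "MKTAYIAK"  # hypothetical eight-residue peptide
substitutional = substitute(wild_type, 3, "S")
insertional = insert_residues(wild_type, 2, "GG")
deletional = delete_residues(wild_type, 4, 5)
```

Because each helper returns a new string, the operations can be chained to build a derivative that combines substitutions, insertions, and deletions.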

Substitutions, deletions, insertions or any combination thereof may be used to arrive 

at a final derivative. Generally these changes are done on a few amino acids to minimize the
alteration of the molecule. Larger changes may be tolerated in certain circumstances. When 
small alterations in the characteristics of a lung cancer protein are desired, substitutions are 
generally made in accordance with the amino acid substitution chart provided in the 
definition section. 

Variants typically exhibit essentially the same qualitative biological activity and will

elicit the same immune response as a naturally-occurring analog, although variants also are 
selected to modify the characteristics of lung cancer proteins as needed. Alternatively, the 
variant may be designed or reorganized such that a biological activity of the lung cancer

protein is altered. For example, glycosylation sites may be added, altered, or removed. 

Covalent modifications of lung cancer polypeptides are included within the scope of 

this invention. One type of covalent modification includes reacting targeted amino acid 

residues of a lung cancer polypeptide with an organic derivatizing agent that is capable of
reacting with selected side chains or the N- or C-terminal residues of a lung cancer

polypeptide. Derivatization with bifunctional agents is useful, for instance, for crosslinking 

lung cancer polypeptides to a water-insoluble support matrix or surface for use in a method 

for purifying anti-lung cancer polypeptide antibodies or screening assays, as is more fully 

described below. Commonly used crosslinking agents include, e.g., 1,1-bis(diazoacetyl)-2-
phenylethane, glutaraldehyde, N-hydroxysuccinimide esters, e.g., esters with 4-azidosalicylic
acid, homobifunctional imidoesters, including disuccinimidyl esters such as 3,3'-
dithiobis(succinimidylpropionate), bifunctional maleimides such as bis-N-maleimido-1,8-
octane and agents such as methyl-3-((p-azidophenyl)dithio)propioimidate.

Other modifications include deamidation of glutaminyl and asparaginyl residues to

the corresponding glutamyl and aspartyl residues, respectively, hydroxylation of proline and 
lysine, phosphorylation of hydroxyl groups of serinyl, threonyl or tyrosyl residues, 
methylation of the α-amino groups of lysine, arginine, and histidine side chains (Creighton
(1983) Proteins: Structure and Molecular Properties, pp. 79-86), acetylation of the N-terminal

amine, and amidation of any C-terminal carboxyl group.

Another type of covalent modification of the lung cancer polypeptide encompassed by 
this invention is an altered native glycosylation pattern of the polypeptide. "Altering the 
native glycosylation pattern" is intended herein to mean adding to or deleting one or more 
carbohydrate moieties of a native sequence lung cancer polypeptide. Glycosylation patterns 

can be altered in many ways. For example, the use of different cell types to express lung
cancer-associated sequences can result in different glycosylation patterns. 

Addition of glycosylation sites to lung cancer polypeptides may also be accomplished 
by altering the amino acid sequence thereof. The alteration may be made, e.g., by the 
addition of, or substitution by, one or more serine or threonine residues to the native sequence 

lung cancer polypeptide (for O-linked glycosylation sites). The lung cancer amino acid
sequence may optionally be altered through changes at the DNA level, particularly by 
mutating the DNA encoding the lung cancer polypeptide at preselected bases such that 
codons are generated that will translate into the desired amino acids. 
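Altering the DNA so that a codon translates into a desired amino acid, such as a serine introduced as a candidate O-linked glycosylation site, can be sketched as follows. The sketch assumes the standard genetic code, an in-frame coding sequence on the sense strand, and 1-based residue numbering. The names `CODON_TABLE`, `translate`, and `mutate_codon` and the example sequence are hypothetical.

```python
from itertools import product

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate an in-frame coding sequence; '*' marks a stop codon."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3  # ignore any trailing partial codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def mutate_codon(dna: str, residue_pos: int, new_codon: str) -> str:
    """Replace the codon for the residue at 1-based residue_pos."""
    new_codon = new_codon.upper()
    if new_codon not in CODON_TABLE:
        raise ValueError("new_codon must be three bases from A, C, G, T")
    start = 3 * (residue_pos - 1)
    if residue_pos < 1 or start + 3 > len(dna):
        raise ValueError("residue position outside the coding sequence")
    return dna[:start] + new_codon + dna[start + 3:]


coding = "ATGGCTAAAGGT"                   # hypothetical sequence encoding M-A-K-G
variant = mutate_codon(coding, 2, "TCT")  # alanine codon replaced by a serine codon
```

Translating `variant` shows the alanine at position 2 replaced by serine, while the flanking codons are unchanged.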
Another means of increasing the number of carbohydrate moieties on the lung cancer

polypeptide is by chemical or enzymatic coupling of glycosides to the polypeptide. Such 

methods are described in the art, e.g., in WO 87/05330, and in Aplin and Wriston (1981) 

CRC Crit. Rev. Biochem., pp. 259-306.

Removal of carbohydrate moieties present on the lung cancer polypeptide may be

accomplished chemically or enzymatically or by mutational substitution of codons encoding 

for amino acid residues that serve as targets for glycosylation. Chemical deglycosylation 

techniques are known in the art and described, for instance, by Hakimuddin, et al. (1987) 

Arch. Biochem. Biophys. 259:52 and by Edge, et al. (1981) Anal. Biochem. 118:131.

Enzymatic cleavage of carbohydrate moieties on polypeptides can be achieved by the use of a
variety of endo- and exo-glycosidases as described by Thotakura, et al. (1987) Meth.
Enzymol. 138:350.

Another type of covalent modification of lung cancer polypeptides comprises linking the lung
cancer polypeptide to one of a variety of nonproteinaceous polymers, e.g., polyethylene 

glycol, polypropylene glycol, or polyoxyalkylenes, in the manner set forth in U.S. Patent
Nos. 4,640,835; 4,496,689; 4,301,144; 4,670,417; 4,791,192, or 4,179,337. 

Lung cancer polypeptides of the present invention may also be modified in a way to 
form chimeric molecules comprising a lung cancer polypeptide fused to another, 
heterologous polypeptide or amino acid sequence. In one embodiment, such a chimeric 

molecule comprises a fusion of a lung cancer polypeptide with a tag polypeptide which
provides an epitope to which an anti-tag antibody can selectively bind. The epitope tag is 
generally placed at the amino- or carboxyl-terminus of the lung cancer polypeptide. The
presence of such epitope-tagged forms of a lung cancer polypeptide can be detected using an 
antibody against the tag polypeptide. Also, provision of the epitope tag enables the lung 

cancer polypeptide to be readily purified by affinity purification using an anti-tag antibody or
another type of affinity matrix that binds to the epitope tag. In an alternative embodiment, 
the chimeric molecule may comprise a fusion of a lung cancer polypeptide with an 
immunoglobulin or a particular region of an immunoglobulin. For a bivalent form of the 
chimeric molecule, such a fusion could be to the Fc region of an IgG molecule. 

Various tag polypeptides and their respective antibodies are well known and examples
include poly-histidine (poly-his) or poly-histidine-glycine (poly-his-gly) tags; His6 and metal
chelation tags, the flu HA tag polypeptide and its antibody 12CA5 (Field, et al. (1988) Mol. 
Cell. Biol. 8:2159-2165); the c-myc tag and the 8F9, 3C7, 6E10, G4, B7 and 9E10 antibodies 
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thereto (Evan, et al. (1985) Molecular and Cellular Biology 5:3610-3616); and the Herpes 

Simplex virus glycoprotein D (gD) tag and its antibody (Paborsky, et al. (1990) Protein 

Engineering 3(6):547-553). Other tag polypeptides include the Flag-peptide (Hopp, et al. 

(1988) BioTechnologv 6:1204-1210); the KT3 epitope peptide (Martin, et al. (1992) Science 

5 255:192-194); tubulin epitope peptide (Skinner, et al. (1991) J. Biol. Chem. 266:15163- 

15166); and the T7 gene 10 protein peptide tag (Lutz-Freyermuth, et al. (1990) Proc. Nat'l 

Acad. Sci. USA 87:6393-6397). 

Also included are other lung cancer proteins of the lung cancer family, and lung
cancer proteins from other organisms, which are cloned and expressed as outlined below.
Thus, probe or degenerate polymerase chain reaction (PCR) primer sequences may be used to
find other related lung cancer proteins from primates or other organisms. As will be
appreciated by those in the art, particularly useful probe and/or PCR primer sequences
include unique areas of the lung cancer nucleic acid sequence. As is generally known in the
art, preferred PCR primers are from about 15 to about 35 nucleotides in length, with from
about 20 to about 30 being preferred, and may contain inosine as needed. PCR reaction
conditions are well known in the art (e.g., Innis, PCR Protocols, supra).
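
The primer length ranges stated above can be illustrated with a short sketch. The function
name, the hard cutoffs (the text says "about"), and the allowed alphabet (A, C, G, T plus I
for inosine) are assumptions made for illustration only and are not part of the disclosure.

```python
# Illustrative sketch only: hard cutoffs stand in for the "about" ranges in the text.
ALLOWED_BASES = set("ACGTI")  # I = inosine (assumed single-letter code)


def classify_primer_length(primer: str) -> str:
    """Classify a primer as 'preferred', 'acceptable', or 'outside range' by length."""
    seq = primer.upper()
    if not seq or not set(seq) <= ALLOWED_BASES:
        raise ValueError("primer must be non-empty and contain only A, C, G, T, or I")
    n = len(seq)
    if 20 <= n <= 30:   # from about 20 to about 30: preferred
        return "preferred"
    if 15 <= n <= 35:   # from about 15 to about 35: broader stated range
        return "acceptable"
    return "outside range"


if __name__ == "__main__":
    for candidate in ("ACGT" * 6, "ACGTI" * 3, "ACGT" * 2):
        print(len(candidate), classify_primer_length(candidate))
```

A check of this kind would only screen candidate primers by length; uniqueness within the
lung cancer nucleic acid sequence would still have to be assessed separately.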

Antibodies to lung cancer proteins 

In a preferred embodiment, when a lung cancer protein is to be used to generate
antibodies, e.g., for immunotherapy or immunodiagnosis, the lung cancer protein should
share at least one epitope or determinant with the full length protein. By "epitope" or
"determinant" herein is typically meant a portion of a protein which will generate and/or bind
an antibody or T-cell receptor in the context of MHC. Thus, in most instances, antibodies
made to a smaller lung cancer protein will be able to bind to the full-length protein,
particularly linear epitopes. In a preferred embodiment, the epitope is unique; that is,
antibodies generated to a unique epitope show little or no cross-reactivity.

Methods of preparing polyclonal antibodies are well known (e.g., Coligan, supra; and
Harlow and Lane, supra). Polyclonal antibodies can be raised in a mammal, e.g., by one or
more injections of an immunizing agent and, if desired, an adjuvant. Typically, the
immunizing agent and/or adjuvant will be injected in the mammal by multiple subcutaneous
or intraperitoneal injections. The immunizing agent may include a protein encoded by a
nucleic acid of Tables 1A-16 or fragment thereof or a fusion protein thereof. It may be useful
to conjugate the immunizing agent to a protein known to be immunogenic in the mammal
being immunized. Immunogenic proteins include, e.g., keyhole limpet hemocyanin, serum
albumin, bovine thyroglobulin, and soybean trypsin inhibitor. Adjuvants include, e.g.,
Freund's complete adjuvant and MPL-TDM adjuvant (monophosphoryl Lipid A, synthetic
trehalose dicorynomycolate). The immunization protocol may be selected by one skilled in
the art.

The antibodies may, alternatively, be monoclonal antibodies. Monoclonal antibodies
may be prepared using hybridoma methods, such as those described by Kohler and Milstein
(1975) Nature 256:495. In a hybridoma method, a mouse, hamster, or other appropriate host
animal, is typically immunized with an immunizing agent to elicit lymphocytes that produce
or are capable of producing antibodies that will specifically bind to the immunizing agent.
Alternatively, the lymphocytes may be immunized in vitro. The immunizing agent will
typically include a polypeptide encoded by a nucleic acid of the tables, or fragment thereof,
or a fusion protein thereof. Generally, either peripheral blood lymphocytes ("PBLs") are
used if cells of human origin are desired, or spleen cells or lymph node cells are used if
non-human mammalian sources are desired. The lymphocytes are then fused with an
immortalized cell line using a suitable fusing agent, such as polyethylene glycol, to form a
hybridoma cell (Goding (1986) Monoclonal Antibodies: Principles and Practice, pp. 59-103).
Immortalized cell lines are usually transformed mammalian cells, particularly myeloma cells
of rodent, bovine, or primate origin. Usually, rat or mouse myeloma cell lines are employed.
The hybridoma cells may be cultured in a suitable culture medium that preferably contains
one or more substances that inhibit the growth or survival of the unfused, immortalized cells.
For example, if the parental cells lack the enzyme hypoxanthine guanine phosphoribosyl
transferase (HGPRT or HPRT), the culture medium for the hybridomas typically will include
hypoxanthine, aminopterin, and thymidine ("HAT medium"), which substances prevent the
growth of HGPRT-deficient cells.

In one embodiment, the antibodies are bispecific antibodies. Bispecific antibodies are
typically monoclonal, preferably human or humanized, antibodies that have binding
specificities for at least two different antigens or that have binding specificities for two
epitopes on the same antigen. In one embodiment, one of the binding specificities is for a
protein encoded by a nucleic acid of the tables or a fragment thereof, the other one is for any
other antigen, and preferably for a cell-surface protein or receptor or receptor subunit,
preferably one that is tumor specific. Alternatively, tetramer-type technology may create
multivalent reagents.

In a preferred embodiment, the antibodies to lung cancer protein are capable of
reducing or eliminating a biological function of a lung cancer protein, in a naked form or
conjugated to an effector moiety. That is, the addition of anti-lung cancer protein antibodies
(either polyclonal or preferably monoclonal) to lung cancer tissue (or cells containing lung
cancer) may reduce or eliminate the lung cancer. Generally, at least a 25% decrease in
activity, growth, size or the like is preferred, with at least about 50% being particularly
preferred and about a 95-100% decrease being especially preferred.

In a preferred embodiment the antibodies to the lung cancer proteins are humanized
antibodies (e.g., Xenerex Biosciences, Medarex, Inc., Abgenix, Inc., Protein Design Labs,
Inc.). Humanized forms of non-human (e.g., murine) antibodies are chimeric molecules of
immunoglobulins, immunoglobulin chains or fragments thereof (such as Fv, Fab, Fab',
F(ab')2 or other antigen-binding subsequences of antibodies) which contain minimal
sequence derived from non-human immunoglobulin. Humanized antibodies include human
immunoglobulins (recipient antibody) in which residues from a complementarity determining
region (CDR) of the recipient are replaced by residues from a CDR of a non-human species
(donor antibody) such as mouse, rat or rabbit having the desired specificity, affinity and
capacity. In some instances, Fv framework residues of a human immunoglobulin are
replaced by corresponding non-human residues. Humanized antibodies may also comprise
residues which are found neither in the recipient antibody nor in the imported CDR or
framework sequences. In general, a humanized antibody will comprise substantially all of at
least one, and typically two, variable domains, in which all or substantially all of the CDR
regions correspond to those of a non-human immunoglobulin and all or substantially all of
the framework (FR) regions are those of a human immunoglobulin consensus sequence. A
humanized antibody optimally will also comprise at least a portion of an
immunoglobulin constant region (Fc), typically that of a human immunoglobulin (Jones, et
al. (1986) Nature 321:522-525; Riechmann, et al. (1988) Nature 332:323-329; and Presta
(1992) Curr. Op. Struct. Biol. 2:593-596). Humanization can be performed following the
method of Winter and co-workers (Jones, et al. (1986) Nature 321:522-525; Riechmann, et al.
(1988) Nature 332:323-327; Verhoeyen, et al. (1988) Science 239:1534-1536), by
substituting rodent CDRs or CDR sequences for corresponding sequences of a human
antibody. Accordingly, such humanized antibodies are chimeric antibodies (U.S. Patent No.
4,816,567), wherein substantially less than an intact human variable domain has been
substituted by corresponding sequence from a non-human species.



Human-like antibodies can also be produced using various techniques known in the
art, including phage display libraries (Hoogenboom and Winter (1991) J. Mol. Biol. 227:381;
Marks, et al. (1991) J. Mol. Biol. 222:581). The techniques of Cole, et al. and Boerner, et al.
are also available for the preparation of human monoclonal antibodies (Cole, et al. (1985)
Monoclonal Antibodies and Cancer Therapy, p. 77 and Boerner, et al. (1991) J. Immunol.
147(1):86-95). Similarly, human antibodies can be made by introducing human
immunoglobulin loci into transgenic animals, e.g., mice in which the endogenous
immunoglobulin genes have been partially or completely inactivated. Upon challenge,
human antibody production is observed, which closely resembles that seen in humans in
nearly all respects, including gene rearrangement, assembly, and antibody repertoire. This
approach is described, e.g., in U.S. Patent Nos. 5,545,807; 5,545,806; 5,569,825; 5,625,126;
5,633,425; 5,661,016, and in the following scientific publications: Marks, et al. (1992)
Bio/Technology 10:779-783; Lonberg, et al. (1994) Nature 368:856-859; Morrison (1994)
Nature 368:812-13; Fishwild, et al. (1996) Nature Biotechnology 14:845-51; Neuberger
(1996) Nature Biotechnology 14:826; and Lonberg and Huszar (1995) Intern. Rev. Immunol.
13:65-93.

By immunotherapy is meant treatment of lung cancer with an antibody raised against
a lung cancer protein. As used herein, immunotherapy can be passive or active. Passive
immunotherapy as defined herein is the passive transfer of antibody to a recipient (patient).
Active immunization is the induction of antibody and/or T-cell responses in a recipient
(patient). Induction of an immune response is the result of providing the recipient with an
antigen to which antibodies are raised. The antigen may be provided by injecting a
polypeptide against which antibodies are desired to be raised into a recipient, or contacting
the recipient with a nucleic acid capable of expressing the antigen and under conditions for
expression of the antigen, leading to an immune response.

In a preferred embodiment the lung cancer proteins against which antibodies are
raised are secreted proteins as described above. Without being bound by theory, antibodies
used for treatment may bind and prevent the secreted protein from binding to its receptor,
thereby inactivating the secreted lung cancer protein.

In another preferred embodiment, the lung cancer protein to which antibodies are
raised is a transmembrane protein. Without being bound by theory, antibodies used for
treatment may bind the extracellular domain of the lung cancer protein and prevent it from
binding to other proteins, such as circulating ligands or cell-associated molecules. The
antibody may cause down-regulation of the transmembrane lung cancer protein. The
antibody may be a competitive, non-competitive or uncompetitive inhibitor of protein binding
to the extracellular domain of the lung cancer protein. The antibody may be an antagonist of
the lung cancer protein or may prevent activation of a transmembrane lung cancer protein, or
may induce or suppress a particular cellular pathway. In some embodiments, when the
antibody prevents the binding of other molecules to the lung cancer protein, the antibody
prevents growth of the cell. The antibody may also be used to target or sensitize the cell to
cytotoxic agents, including, but not limited to, TNF-α, TNF-β, IL-1, INF-γ, and IL-2, or
chemotherapeutic agents including 5FU, vinblastine, actinomycin D, cisplatin, methotrexate,
and the like. In some instances the antibody may belong to a sub-type that activates serum
complement when complexed with the transmembrane protein, thereby mediating cytotoxicity
or antibody-dependent cellular cytotoxicity (ADCC). Thus, lung cancer may be treated by
administering to a patient antibodies directed against the transmembrane lung cancer protein.
Antibody-labeling may activate a co-toxin, localize a toxin payload, or otherwise provide
means to locally ablate cells.

In another preferred embodiment, the antibody is conjugated to an effector moiety.
The effector moiety can be various molecules, including labeling moieties such as radioactive
labels or fluorescent labels, or can be a therapeutic moiety. In one aspect the therapeutic
moiety is a small molecule that modulates the activity of a lung cancer protein. In another
aspect the therapeutic moiety may modulate an activity of molecules associated with or in
close proximity to a lung cancer protein. The therapeutic moiety may inhibit enzymatic or
signaling activity such as protease or collagenase activity associated with lung cancer.

In a preferred embodiment, the therapeutic moiety can also be a cytotoxic agent. In
this method, targeting the cytotoxic agent to lung cancer tissue or cells results in a reduction
in the number of afflicted cells, thereby reducing symptoms associated with lung cancer.
Cytotoxic agents are numerous and varied and include, but are not limited to, cytotoxic drugs
or toxins or active fragments of such toxins. Suitable toxins and their corresponding
fragments include diphtheria A chain, exotoxin A chain, ricin A chain, abrin A chain, curcin,
crotin, phenomycin, enomycin, saporin, auristatin, and the like. Cytotoxic agents also include
radiochemicals made by conjugating radioisotopes to antibodies raised against lung cancer
proteins, or binding of a radionuclide to a chelating agent that has been covalently attached to
the antibody. Targeting the therapeutic moiety to transmembrane lung cancer proteins not
only serves to increase the local concentration of therapeutic moiety in the lung cancer
afflicted area, but also serves to reduce deleterious side effects that may be associated with
the untargeted therapeutic moiety.

In another preferred embodiment, the lung cancer protein against which the antibodies
are raised is an intracellular protein. In this case, the antibody may be conjugated to a protein
or other entity which facilitates entry into the cell. In one case, the antibody enters the cell by
endocytosis. In another embodiment, a nucleic acid encoding the antibody is administered to
the individual or cell. Moreover, wherein the lung cancer protein can be targeted within a
cell, e.g., the nucleus, an antibody thereto may contain a signal for that target localization, e.g.,
a nuclear localization signal.

The lung cancer antibodies of the invention specifically bind to lung cancer proteins.
By "specifically bind" herein is meant that the antibodies bind to the protein with a Kd of at
least about 0.1 mM, more usually at least about 1 µM, preferably at least about 0.1 µM or
better, and most preferably, 0.01 µM or better. Selectivity of binding to the specific target
and not to related other sequences is also important.
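
The binding tiers above can be sketched as a simple lookup, taking the thresholds as
0.1 mM, 1 µM, 0.1 µM, and 0.01 µM and reading "at least" and "or better" to mean a Kd no
greater than the stated value (a lower Kd indicating tighter binding). That reading, the
function name, and the tier labels are assumptions for illustration only.

```python
# Illustrative sketch only: thresholds in molar units, tightest tier first.
KD_TIERS = [
    (1e-8, "most preferred"),    # 0.01 uM or better
    (1e-7, "preferred"),         # 0.1 uM or better
    (1e-6, "more usual"),        # about 1 uM
    (1e-4, "specific binding"),  # about 0.1 mM
]


def binding_tier(kd_molar: float) -> str:
    """Return the tightest tier whose threshold the dissociation constant meets."""
    if kd_molar <= 0:
        raise ValueError("Kd must be a positive concentration in mol/L")
    for threshold, label in KD_TIERS:
        if kd_molar <= threshold:
            return label
    return "not specific"


if __name__ == "__main__":
    for kd in (5e-9, 5e-8, 5e-7, 5e-5, 1e-3):
        print(f"Kd = {kd:.0e} M -> {binding_tier(kd)}")
```

Affinity alone does not establish selectivity; cross-reactivity with related sequences would
need to be measured separately.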


Detection of lung cancer sequence for diagnostic and therapeutic applications 

In one aspect, the RNA expression levels of genes are determined for different
cellular states in the lung cancer phenotype. Expression levels of genes in normal tissue (e.g.,
not undergoing lung cancer), in lung cancer tissue (and in some cases, for varying severities
of lung cancer that relate to prognosis, as outlined below), or in non-malignant disease are
evaluated to provide expression profiles. A gene expression profile of a particular cell state
or point of development is essentially a "fingerprint" of the state of the cell. While two states
may have a particular gene similarly expressed, the evaluation of a number of genes
simultaneously allows the generation of a gene expression profile that is reflective of the state
of the cell. By comparing expression profiles of cells in different states, information
regarding which genes are important (including both up- and down-regulation of genes) in
each of these states is obtained. Then, diagnosis may be performed or confirmed to
determine whether a tissue sample has the gene expression profile of normal or cancerous
tissue. This will provide for molecular diagnosis of related conditions.
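
One way to picture the comparison of a sample "fingerprint" against reference profiles is
the minimal sketch below. It uses Pearson correlation as the similarity measure, which is an
assumption chosen for illustration and is not stated in this disclosure; the gene names and
expression values are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if norm_x == 0 or norm_y == 0:
        raise ValueError("profiles must not be constant")
    return cov / (norm_x * norm_y)


def closest_state(sample, references):
    """Return the reference state whose profile correlates best with the sample."""
    genes = sorted(sample)  # fixed gene order shared by all profiles
    scores = {
        state: pearson([sample[g] for g in genes], [profile[g] for g in genes])
        for state, profile in references.items()
    }
    return max(scores, key=scores.get), scores


if __name__ == "__main__":
    # Hypothetical expression levels (arbitrary units) for four hypothetical genes.
    references = {
        "normal": {"geneA": 1.0, "geneB": 5.0, "geneC": 2.0, "geneD": 8.0},
        "lung cancer": {"geneA": 6.0, "geneB": 1.0, "geneC": 7.0, "geneD": 2.0},
    }
    sample = {"geneA": 5.5, "geneB": 1.5, "geneC": 6.0, "geneD": 2.5}
    state, scores = closest_state(sample, references)
    print(state, scores)
```

In practice a profile would cover many more genes, and the choice of similarity measure
and any decision threshold would need to be validated against clinical samples.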

"Differential expression," or grammatical equivalents as used herein, refers to
qualitative or quantitative differences in the temporal and/or cellular gene expression
patterns within and among cells and tissue. Thus, a differentially expressed gene can
qualitatively have its expression altered, including an activation or inactivation, in, e.g.,
normal versus lung cancer tissue. Genes may be turned on or turned off in a particular state,
relative to another state thus permitting comparison of two or more states. A qualitatively
regulated gene will exhibit an expression pattern within a state or cell type which is
detectable by standard techniques. Some genes will be expressed in one state or cell type, but
not in both. Alternatively, the difference in expression may be quantitative, e.g., in that
expression is increased or decreased; i.e., gene expression is either upregulated, resulting in
an increased amount of transcript, or downregulated, resulting in a decreased amount of
transcript. The degree to which expression differs need only be large enough to quantify via
standard characterization techniques as outlined below, such as by use of Affymetrix
GeneChip™ expression arrays, Lockhart (1996) Nature Biotechnology 14:1675-1680, hereby
expressly incorporated by reference. Other techniques include, but are not limited to,
quantitative reverse transcriptase PCR, northern analysis and RNase protection. As outlined
above, preferably the change in expression (i.e., upregulation or downregulation) is typically
at least about 50%, more preferably at least about 100%, more preferably at least about
150%, more preferably at least about 200%, with from 300 to at least 1000% being especially
preferred.
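
The percentage thresholds above can be made concrete with a small sketch. The text does
not say how a percent change is computed for downregulation, so this example assumes a
symmetric measure, (higher level / lower level - 1) x 100, under which a two-fold change in
either direction counts as 100%. That convention, the function names, and the tier labels
are assumptions for illustration only.

```python
# Illustrative sketch only: tiers follow the percentages stated in the text.
CHANGE_TIERS = [
    (300.0, "especially preferred"),
    (200.0, "at least about 200%"),
    (150.0, "at least about 150%"),
    (100.0, "at least about 100%"),
    (50.0, "at least about 50%"),
]


def percent_change(reference: float, sample: float) -> float:
    """Symmetric percent change between two positive expression levels."""
    if reference <= 0 or sample <= 0:
        raise ValueError("expression levels must be positive")
    high, low = max(reference, sample), min(reference, sample)
    return (high / low - 1.0) * 100.0


def change_tier(pct: float) -> str:
    """Return the highest stated tier that the percent change reaches."""
    for threshold, label in CHANGE_TIERS:
        if pct >= threshold:
            return label
    return "below about 50%"


if __name__ == "__main__":
    for ref, smp in ((100, 150), (100, 300), (200, 50), (100, 120)):
        pct = percent_change(ref, smp)
        direction = "up" if smp > ref else "down"
        print(f"{ref} -> {smp}: {pct:.0f}% {direction}, {change_tier(pct)}")
```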

Evaluation may be at the gene transcript or the protein level. The amount of gene
expression may be monitored using nucleic acid probes to the RNA or DNA equivalent of the
gene transcript, and the quantification of gene expression levels, or, alternatively, the final
gene product itself (protein) can be monitored, e.g., with antibodies to the lung cancer protein
and standard immunoassays (ELISAs, etc.) or other techniques, including mass spectroscopy
assays, 2D gel electrophoresis assays, etc. Proteins corresponding to lung cancer genes, e.g.,
those identified as being important in a lung cancer or disease phenotype, can be evaluated in
a lung cancer diagnostic test. In a preferred embodiment, gene expression monitoring is
performed simultaneously on a number of genes.

The lung cancer nucleic acid probes may be attached to biochips as outlined herein for
the detection and quantification of lung cancer sequences in a particular cell. The assays are
further described below in the example. PCR techniques can be used to provide greater
sensitivity. Multiple protein expression monitoring can be performed as well. Similarly,
these assays may be performed on an individual basis as well.

In a preferred embodiment nucleic acids encoding the lung cancer protein are
detected. Although DNA or RNA encoding the lung cancer protein may be detected, of
particular interest are methods wherein an mRNA encoding a lung cancer protein is detected.
Probes to detect mRNA can be a nucleotide/deoxynucleotide probe that is complementary to
and hybridizes with the mRNA and includes, but is not limited to, oligonucleotides, cDNA or
RNA. Probes also should contain a detectable label, as defined herein. In one method the
mRNA is detected after immobilizing the nucleic acid to be examined on a solid support such
as nylon membranes and hybridizing the probe with the sample. Following washing to
remove the non-specifically bound probe, the label is detected. In another method detection
of the mRNA is performed in situ. In this method permeabilized cells or tissue samples are
contacted with a detectably labeled nucleic acid probe for sufficient time to allow the probe
to hybridize with the target mRNA. Following washing to remove the non-specifically bound
probe, the label is detected. For example, a digoxigenin-labeled riboprobe (RNA probe) that
is complementary to the mRNA encoding a lung cancer protein is detected by binding the
digoxigenin with an anti-digoxigenin secondary antibody and developed with nitro blue
tetrazolium and 5-bromo-4-chloro-3-indolyl phosphate.

In a preferred embodiment, various proteins from the three classes of proteins as
described herein (secreted, transmembrane or intracellular proteins) are used in diagnostic
assays. The lung cancer proteins, antibodies, nucleic acids, modified proteins and cells
containing lung cancer sequences are used in diagnostic assays. This can be performed on an
individual gene or corresponding polypeptide level. In a preferred embodiment, the
expression profiles are used, preferably in conjunction with high throughput screening
techniques to allow monitoring for expression profile genes and/or corresponding
polypeptides.

As described and defined herein, lung cancer proteins, including intracellular,
transmembrane, or secreted proteins, find use as markers of lung cancer, e.g., for prognostic
or diagnostic purposes. Detection of these proteins in putative lung cancer tissue allows for
detection, prognosis, or diagnosis of lung cancer or similar disease, and perhaps for selection
of therapeutic strategy. In one embodiment, antibodies are used to detect lung cancer
proteins. A preferred method separates proteins from a sample by electrophoresis on a gel
(typically a denaturing and reducing protein gel, but may be another type of gel, including
isoelectric focusing gels and the like). Following separation of proteins, the lung cancer
protein is detected, e.g., by immunoblotting with antibodies raised against the lung cancer
protein. Methods of immunoblotting are well known to those of ordinary skill in the art.

In another preferred method, antibodies to the lung cancer protein find use in in situ
imaging techniques, e.g., in histology (e.g., Asai (ed. 1993) Methods in Cell Biology:
Antibodies in Cell Biology, volume 37). In this method cells are contacted with from one to
many antibodies to the lung cancer protein(s). Following washing to remove non-specific
antibody binding, the presence of the antibody or antibodies is detected. In one embodiment
the antibody is detected by incubating with a secondary antibody that contains a detectable
label, e.g., multicolor fluorescence or confocal imaging. In another method the primary
antibody to the lung cancer protein(s) contains a detectable label, e.g., an enzyme marker that
can act on a substrate. In another preferred embodiment each one of multiple primary
antibodies contains a distinct and detectable label. This method finds particular use in
simultaneous screening for a plurality of lung cancer proteins. Many other histological
imaging techniques are also provided by the invention.

In a preferred embodiment the label is detected in a fluorometer which has the ability 
to detect and distinguish emissions of different wavelengths. In addition, a fluorescence 
activated cell sorter (FACS) can be used in the method. 

In another preferred embodiment, antibodies find use in diagnosing lung cancer from
blood, serum, plasma, stool, and other samples. Such samples, therefore, are useful as
samples to be probed or tested for the presence of lung cancer proteins. Antibodies can be
used to detect a lung cancer protein by previously described immunoassay techniques
including ELISA, immunoblotting (western blotting), immunoprecipitation, BIACORE
technology and the like. Conversely, the presence of antibodies may indicate an immune
response against an endogenous lung cancer protein or vaccine.

In a preferred embodiment, in situ hybridization of labeled lung cancer nucleic acid
probes to tissue arrays is done. For example, arrays of tissue samples, including lung cancer
tissue and/or normal tissue, are made. In situ hybridization (see, e.g., Ausubel, supra) is then
performed. When comparing the fingerprints between an individual and a standard, the
skilled artisan can make a diagnosis, a prognosis, or a prediction based on the findings. It is
further understood that the genes which indicate the diagnosis may differ from those which
indicate the prognosis and molecular profiling of the condition of the cells may lead to
distinctions between responsive or refractory conditions or may be predictive of outcomes.

In a preferred embodiment, the lung cancer proteins, antibodies, nucleic acids,
modified proteins and cells containing lung cancer sequences are used in prognosis assays.
As above, gene expression profiles can be generated that correlate to lung cancer, clinical,
pathological, or other information, in terms of long term prognosis. Again, this may be done
on either a protein or gene level, with the use of genes being preferred. Single or multiple
genes may be useful in various combinations. As above, lung cancer probes may be attached
to biochips for the detection and quantification of lung cancer sequences in a tissue or patient.
The assays proceed as outlined above for diagnosis. PCR methods may provide more
sensitive and accurate quantification.


Assays for therapeutic compounds 

In a preferred embodiment, the proteins, nucleic acids, and antibodies as described
herein are used in drug screening assays. The lung cancer proteins, antibodies, nucleic acids,
modified proteins and cells containing lung cancer sequences are used in drug screening
assays or by evaluating the effect of drug candidates on a "gene expression profile" or
expression profile of polypeptides. In a preferred embodiment, the expression profiles are
used, preferably in conjunction with high throughput screening techniques to allow
monitoring for expression profile genes after treatment with a candidate agent (e.g.,
Zlokarnik, et al. (1998) Science 279:84-8; Heid (1996) Genome Res. 6:986-94).

In a preferred embodiment, the lung cancer proteins, antibodies, nucleic acids,
modified proteins and cells containing the native or modified lung cancer proteins are used in
screening assays. That is, the present invention provides novel methods for screening for
compositions which modulate the lung cancer phenotype or an identified physiological
function of a lung cancer protein. As above, this can be done on an individual gene level or
by evaluating the effect of drug candidates on a "gene expression profile". In a preferred
embodiment, the expression profiles are used, preferably in conjunction with high throughput
screening techniques to allow monitoring for expression profile genes after treatment with a
candidate agent, see Zlokarnik, supra.

Having identified differentially expressed genes herein, a variety of assays may be
performed. In a preferred embodiment, assays may be run on an individual gene or protein
level. That is, having identified a particular gene with altered regulation in lung cancer, test
compounds can be screened for the ability to modulate gene expression or for binding to the
lung cancer protein. "Modulation" thus includes an increase or a decrease in gene
expression. The preferred amount of modulation will depend on the original change of the
gene expression in normal tissue versus tissue undergoing lung cancer, with changes of at least
10%, preferably 50%, more preferably 100-300%, and in some embodiments 300-1000% or
greater. Thus, if a gene exhibits a 4-fold increase in lung cancer tissue compared to normal
tissue, a decrease of about four-fold is often desired; similarly, a 10-fold decrease in lung
cancer tissue compared to normal tissue often provides a target value of a 10-fold increase in
expression to be induced by the test compound.
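
The worked examples above (a 4-fold increase calling for about a 4-fold decrease, and a
10-fold decrease calling for a 10-fold increase) amount to inverting the observed fold
change. A minimal sketch follows; the function name and the input values are hypothetical.

```python
def target_modulation(normal_level: float, cancer_level: float):
    """Return the direction and fold change that would restore the normal level."""
    if normal_level <= 0 or cancer_level <= 0:
        raise ValueError("expression levels must be positive")
    if cancer_level > normal_level:
        # Gene is upregulated in lung cancer tissue: aim for a matching decrease.
        return ("decrease", cancer_level / normal_level)
    if cancer_level < normal_level:
        # Gene is downregulated in lung cancer tissue: aim for a matching increase.
        return ("increase", normal_level / cancer_level)
    return ("none", 1.0)


if __name__ == "__main__":
    print(target_modulation(10, 40))   # 4-fold up in cancer
    print(target_modulation(100, 10))  # 10-fold down in cancer
```

The text says such a target is "often" desired, so the returned value is best read as a
starting point for evaluating a test compound rather than a fixed requirement.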

The amount of gene expression may be monitored using nucleic acid probes and the
quantification of gene expression levels, or, alternatively, the gene product itself can be
monitored, e.g., through the use of antibodies to the lung cancer protein and standard
immunoassays. Proteomics and separation techniques may also allow quantification of
expression.

In a preferred embodiment, the expression of a number of genes or proteins, i.e., an
expression profile, is monitored simultaneously. Such profiles will typically
involve a plurality of those entities described herein.

In this embodiment, the lung cancer nucleic acid probes are attached to biochips as
outlined herein for the detection and quantification of lung cancer sequences in a particular
cell. Alternatively, PCR may be used. Thus, e.g., a series of microtiter plates may be used
with dispensed primers in desired wells. A PCR reaction can then be performed and analyzed
for each well.

Expression monitoring can be performed to identify compounds that modify the
expression of one or more lung cancer-associated sequences, e.g., a polynucleotide sequence
set out in the tables. Generally, in a preferred embodiment, a test compound is added to the
cells prior to analysis. Moreover, screens are also provided to identify agents that modulate
lung cancer, modulate lung cancer proteins, bind to a lung cancer protein, or interfere with
the binding of a lung cancer protein and an antibody, substrate, or other binding partner.

The term "test compound" or "drug candidate" or "modulator" or grammatical
equivalents as used herein describes a molecule, e.g., protein, oligopeptide, small organic
molecule, polysaccharide, polynucleotide, etc., to be tested for the capacity to directly or
indirectly alter the lung cancer phenotype or the expression of a lung cancer sequence, e.g., a
nucleic acid or protein sequence. In preferred embodiments, modulators alter expression
profiles of nucleic acids or proteins provided herein. In one embodiment, the modulator
suppresses a lung cancer phenotype, e.g., to a normal or non-malignant tissue fingerprint. In
another embodiment, a modulator induces a lung cancer phenotype. Generally, a plurality of
assay mixtures are run in parallel with different agent concentrations to obtain a differential
response to the various concentrations. Typically, one of these concentrations serves as a
negative control, i.e., at zero concentration or below the level of detection.

In one aspect, a modulator will neutralize the effect of a lung cancer protein. By
"neutralize" is meant that activity of a protein and the consequent effect on the cell is
inhibited or blocked.

In certain embodiments, combinatorial libraries of potential modulators will be 
screened for an ability to bind to a lung cancer polypeptide or to modulate activity. 
Conventionally, new chemical entities with useful properties are generated by identifying a 
chemical compound (called a "lead compound") with some desirable property or activity, 
e.g., inhibiting activity, creating variants of the lead compound, and evaluating the property 
and activity of those variant compounds. Often, high throughput screening (HTS) methods 
are employed for such an analysis. 

In one preferred embodiment, high throughput screening methods involve providing a 
library containing a large number of potential therapeutic compounds (candidate 
compounds). Such "combinatorial chemical libraries" are then screened in one or more 
assays to identify those library members (particular chemical species or subclasses) that 
display a desired characteristic activity. The compounds thus identified can serve as 
conventional "lead compounds" or can themselves be used as potential or actual therapeutics. 

A combinatorial chemical library is a collection of diverse chemical compounds 
generated by either chemical synthesis or biological synthesis by combining a number of 
chemical "building blocks" such as reagents. For example, a linear combinatorial chemical 
library, such as a polypeptide (e.g., mutein) library, is formed by combining a set of chemical 
building blocks called amino acids in every possible way for a given compound length (i.e., 
the number of amino acids in a polypeptide compound). Millions of chemical compounds 
can be synthesized through such combinatorial mixing of chemical building blocks (Gallop, 
et al. (1994) J. Med. Chem. 37(9):1233-1251). 
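
The combinatorial growth described above can be illustrated with a short sketch that
enumerates every linear peptide of a given length from the 20 standard amino acids. The
function names are hypothetical, and the sketch only counts and lists sequences; it does
not model any synthesis chemistry.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # one-letter codes of the 20 standard amino acids


def library_size(length, building_blocks=AMINO_ACIDS):
    """Number of distinct linear sequences of the given length."""
    return len(building_blocks) ** length


def enumerate_library(length, building_blocks=AMINO_ACIDS):
    """Yield every sequence formed by combining the building blocks in every possible way."""
    for combo in product(building_blocks, repeat=length):
        yield "".join(combo)


dipeptides = list(enumerate_library(2))   # small enough to hold in memory
pentapeptide_count = library_size(5)      # counted, not enumerated
```

Even at a length of five residues the count already reaches 3.2 million sequences, which
is consistent with the statement that millions of compounds can be generated this way.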

Preparation and screening of combinatorial chemical libraries is well known to those
of skill in the art. Such combinatorial chemical libraries include, but are not limited to,
peptide libraries (see, e.g., U.S. Patent No. 5,010,175, Furka (1991) Pept. Prot. Res.
37:487-493, Houghton, et al. (1991) Nature 354:84-88), peptoids (PCT Publication No. WO
91/19735), encoded peptides (PCT Publication WO 93/20242), random bio-oligomers (PCT
Publication WO 92/00091), benzodiazepines (U.S. Pat. No. 5,288,514), diversomers such as
hydantoins, benzodiazepines and dipeptides (Hobbs, et al. (1993) Proc. Nat. Acad. Sci. USA
90:6909-6913), vinylogous polypeptides (Hagihara, et al. (1992) J. Amer. Chem. Soc.
114:6568), nonpeptidal peptidomimetics with a Beta-D-Glucose scaffolding (Hirschmann, et
al. (1992) J. Amer. Chem. Soc. 114:9217-9218), analogous organic syntheses of small
compound libraries (Chen, et al. (1994) J. Amer. Chem. Soc. 116:2661), oligocarbamates
(Cho, et al. (1993) Science 261:1303), and/or peptidyl phosphonates (Campbell, et al. (1994)
J. Org. Chem. 59:658). See, generally, Gordon, et al. (1994) J. Med. Chem. 37:1385, nucleic
acid libraries (see, e.g., Stratagene, Corp.), peptide nucleic acid libraries (see, e.g., U.S.
Patent 5,539,083), antibody libraries (see, e.g., Vaughn, et al. (1996) Nature Biotechnology
14(3):309-314, and PCT/US96/10287), carbohydrate libraries (see, e.g., Liang, et al. (1996)
Science 274:1520-1522, and U.S. Patent No. 5,593,853), and small organic molecule libraries
(see, e.g., benzodiazepines, Baum (1993) C&EN, Jan 18, page 33; isoprenoids, U.S. Patent
No. 5,569,588; thiazolidinones and metathiazanones, U.S. Patent No. 5,549,974; pyrrolidines,
U.S. Patent Nos. 5,525,735 and 5,519,134; morpholino compounds, U.S. Patent No.
5,506,337; benzodiazepines, U.S. Patent No. 5,288,514; and the like).

Devices for the preparation of combinatorial libraries are commercially available (see,
e.g., 357 MPS, 390 MPS, Advanced Chem Tech, Louisville, KY; Symphony, Rainin,
Woburn, MA; 433A, Applied Biosystems, Foster City, CA; 9050 Plus, Millipore, Bedford,
MA).

A number of well known robotic systems have also been developed for solution phase
chemistries. These systems include automated workstations like the automated synthesis
apparatus developed by Takeda Chemical Industries, LTD. (Osaka, Japan) and many robotic
systems utilizing robotic arms (Zymate II, Zymark Corporation, Hopkinton, Mass.; Orca,
Hewlett-Packard, Palo Alto, Calif.), which mimic the manual synthetic operations performed
by a chemist. The above devices, with appropriate modification, are suitable for use with the
present invention. In addition, numerous combinatorial libraries are themselves
commercially available (see, e.g., ComGenex, Princeton, N.J., Asinex, Moscow, Ru, Tripos,
Inc., St. Louis, MO, ChemStar, Ltd, Moscow, RU, 3D Pharmaceuticals, Exton, PA, Martek
Biosciences, Columbia, MD, etc.).

The assays to identify modulators are amenable to high throughput screening. 
Preferred assays thus detect modulation of lung cancer gene transcription, polypeptide 
expression, and polypeptide activity. 

High throughput assays for evaluating the presence, absence, quantification, or other
properties of particular nucleic acids or protein products are well known to those of skill in
the art. Binding assays and reporter gene assays are similarly well known. Thus,
e.g., U.S. Patent No. 5,559,410 discloses high throughput screening methods for proteins,
U.S. Patent No. 5,585,639 discloses high throughput screening methods for nucleic acid
binding (i.e., in arrays), while U.S. Patent Nos. 5,576,220 and 5,541,061 disclose high
throughput methods of screening for ligand/antibody binding.

In addition, high throughput screening systems are commercially available (see, e.g.,
Zymark Corp., Hopkinton, MA; Air Technical Industries, Mentor, OH; Beckman
Instruments, Inc., Fullerton, CA; Precision Systems, Inc., Natick, MA, etc.). These systems
typically automate procedures, including sample and reagent pipetting, liquid dispensing,
timed incubations, and final readings of the microplate in detector(s) appropriate for the
assay. These configurable systems provide high throughput and rapid start up as well as a
high degree of flexibility and customization. The manufacturers of such systems provide
detailed protocols for various high throughput systems. Thus, e.g., Zymark Corp. provides
technical bulletins describing screening systems for detecting the modulation of gene
transcription, ligand binding, and the like.

In one embodiment, modulators are proteins, often naturally occurring proteins or
fragments of naturally occurring proteins. Thus, e.g., cellular extracts containing proteins, or
random or directed digests of proteinaceous cellular extracts, may be used. In this way
libraries of proteins may be made for screening in the methods of the invention. Particularly
preferred in this embodiment are libraries of bacterial, fungal, viral, and mammalian proteins,
with the latter being preferred, and human proteins being especially preferred. Particularly
useful test compounds will be directed to the class of proteins to which the target belongs, e.g.,
substrates for enzymes or ligands and receptors.

In a preferred embodiment, modulators are peptides of from about 5 to about 30
amino acids, with from about 5 to about 20 amino acids being preferred, and from about 7 to
about 15 being particularly preferred. The peptides may be digests of naturally occurring
proteins, random peptides, or "biased" random peptides. By "randomized" or grammatical
equivalents herein is meant that the nucleic acid or peptide consists of essentially random
sequences of nucleotides and amino acids, respectively. Since these random peptides (or
nucleic acids, discussed below) are often chemically synthesized, they may incorporate any
nucleotide or amino acid at any position. The synthetic process can be designed to generate
randomized proteins or nucleic acids, to allow the formation of all or most of the possible
combinations over the length of the sequence, thus forming a library of randomized candidate
bioactive proteinaceous agents.

In one embodiment, the library is fully randomized, with no sequence preferences or
constants at any position. In a preferred embodiment, the library is biased. That is, some
positions within the sequence are either held constant, or are selected from a limited number
of possibilities. In a preferred embodiment, the nucleotides or amino acid residues are
randomized within a defined class, e.g., of hydrophobic amino acids, hydrophilic residues,
sterically biased (either small or large) residues, towards the creation of nucleic acid binding
domains, the creation of cysteines for cross-linking, prolines for SH-3 domains, serines,
threonines, tyrosines or histidines for phosphorylation sites, etc.
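
The distinction between a fully randomized and a biased library can be sketched in a few
lines. In this hypothetical example each position of a peptide template is either fully
random, held constant, or drawn from a limited class. The template, the helper name
`biased_peptide`, and the particular residues grouped as hydrophobic are illustrative
assumptions rather than anything specified in the text.

```python
import random

ANY = "ACDEFGHIKLMNPQRSTVWY"   # fully randomized position
HYDROPHOBIC = "AVLIMFW"        # one possible grouping (an assumption)
PHOSPHO_SITE = "STY"           # serine, threonine, or tyrosine


def biased_peptide(template, rng):
    """Draw one residue per position from that position's allowed set."""
    return "".join(rng.choice(allowed) for allowed in template)


# Positions 3 and 5 are held constant (proline, cysteine); position 4 is
# restricted to a class; the last position is restricted to phosphorylatable residues.
template = [ANY, ANY, "P", HYDROPHOBIC, "C", ANY, PHOSPHO_SITE]

rng = random.Random(0)         # seeded so the sketch is repeatable
library = [biased_peptide(template, rng) for _ in range(5)]
```

Replacing every entry of `template` with `ANY` would give the fully randomized case.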

Modulators of lung cancer can also be nucleic acids, as defined above. 

As described above generally for proteins, nucleic acid modulating agents may be
naturally occurring nucleic acids, random nucleic acids, or "biased" random nucleic acids.
Digests of procaryotic or eucaryotic genomes may be used as is outlined above for proteins.

In a preferred embodiment, the candidate compounds are organic chemical moieties, a 
wide variety of which are available in the literature. 

After a candidate agent has been added and the cells allowed to incubate for some
period of time, the sample containing a target sequence is analyzed. If required, the target
sequence is prepared using known techniques. For example, the sample may be treated to
lyse the cells, using known lysis buffers, electroporation, etc., with purification and/or
amplification such as PCR performed as appropriate. For example, an in vitro transcription
with labels covalently attached to the nucleotides is performed. Generally, the nucleic acids
are labeled with biotin-FITC or PE, or with cy3 or cy5.

In a preferred embodiment, the target sequence is labeled with, e.g., a fluorescent, a
chemiluminescent, a chemical, or a radioactive signal, to provide a means of detecting the
target sequence's specific binding to a probe. The label also can be an enzyme, such as
alkaline phosphatase or horseradish peroxidase, which when provided with an appropriate
substrate produces a product that can be detected. Alternatively, the label can be a labeled
compound or small molecule, such as an enzyme inhibitor, that binds but is not catalyzed or
altered by the enzyme. The label also can be a moiety or compound, such as an epitope tag
or biotin which specifically binds to streptavidin. For the example of biotin, the streptavidin
is labeled as described above, thereby providing a detectable signal for the bound target
sequence. Unbound labeled streptavidin is typically removed prior to analysis.

Nucleic acid assays can be direct hybridization assays or can comprise "sandwich
assays", which include the use of multiple probes, as is generally outlined in U.S. Patent Nos.
5,681,702, 5,597,909, 5,545,730, 5,594,117, 5,591,584, 5,571,670, 5,580,731, 5,571,670,
5,591,584, 5,624,802, 5,635,352, 5,594,118, 5,359,100, 5,124,246 and 5,681,697, all of
which are hereby incorporated by reference. In this embodiment, in general, the target nucleic
acid is prepared as outlined above, and then added to the biochip comprising a plurality of
nucleic acid probes, under conditions that allow the formation of a hybridization complex.

A variety of hybridization conditions may be used in the present invention, including
high, moderate and low stringency conditions as outlined above. The assays are generally
run under stringency conditions which allow formation of the label probe hybridization
complex only in the presence of target. Stringency can be controlled by altering a step
parameter that is a thermodynamic variable, including, but not limited to, temperature,
formamide concentration, salt concentration, chaotropic salt concentration, pH, organic
solvent concentration, etc.
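
As a rough illustration of how several of these parameters interact, the sketch below
uses one widely quoted empirical approximation for the melting temperature of long
DNA:DNA hybrids. The formula is not taken from this specification, its constants differ
somewhat between references, and the function name and example values are hypothetical.
Hybridizing or washing closer to the estimated melting temperature corresponds to higher
stringency.

```python
import math


def estimate_tm_celsius(na_molar, gc_percent, formamide_percent, duplex_length):
    """Approximate melting temperature (deg C) of a long DNA:DNA hybrid.

    Empirical approximation; an assumption for illustration only.
    """
    if na_molar <= 0 or duplex_length <= 0:
        raise ValueError("salt concentration and duplex length must be positive")
    return (
        81.5
        + 16.6 * math.log10(na_molar)   # more monovalent salt stabilizes the duplex
        + 0.41 * gc_percent             # GC-rich duplexes melt at higher temperature
        - 0.61 * formamide_percent      # formamide lowers the melting temperature
        - 500.0 / duplex_length         # shorter duplexes melt at lower temperature
    )


tm_no_formamide = estimate_tm_celsius(1.0, 50.0, 0.0, 500)
tm_with_formamide = estimate_tm_celsius(1.0, 50.0, 50.0, 500)
```

Under this approximation, adding formamide or lowering the salt concentration reduces the
estimated melting temperature, so a fixed incubation temperature becomes more stringent.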

These parameters may also be used to control non-specific binding, as is generally
outlined in U.S. Patent No. 5,681,697. Thus it may be desirable to perform certain steps at
higher stringency conditions to reduce non-specific binding.

The reactions outlined herein may be accomplished in a variety of ways. Components
of the reaction may be added simultaneously, or sequentially, in different orders, with
preferred embodiments outlined below. In addition, the reaction may include a variety of
other reagents. These include salts, buffers, neutral proteins, e.g., albumin, detergents, etc.
which may be used to facilitate optimal hybridization and detection, and/or reduce
non-specific or background interactions. Reagents that otherwise improve the efficiency of the
assay, such as protease inhibitors, nuclease inhibitors, anti-microbial agents, etc., may also be
used as appropriate, depending on the sample preparation methods and purity of the target.

The assay data are analyzed to determine the expression levels, and changes in
expression levels as between states, of individual genes, forming a gene expression profile.
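
One simple way to represent changes in expression levels between two states is a per-gene
log2 ratio. The following is a minimal sketch under stated assumptions: the gene names,
signal values, pseudocount, and the function name `expression_profile` are all
hypothetical, and the specification does not prescribe this particular calculation.

```python
import math


def expression_profile(state_a, state_b, pseudocount=1.0):
    """Return log2(state_b / state_a) for each gene measured in both states.

    The pseudocount avoids division by zero for genes with no detectable signal.
    """
    profile = {}
    for gene in sorted(state_a.keys() & state_b.keys()):
        ratio = (state_b[gene] + pseudocount) / (state_a[gene] + pseudocount)
        profile[gene] = math.log2(ratio)
    return profile


# Made-up signal intensities for three placeholder genes.
normal = {"GENE_A": 99.0, "GENE_B": 49.0, "GENE_C": 9.0}
tumor = {"GENE_A": 399.0, "GENE_B": 49.0, "GENE_C": 4.0}
profile = expression_profile(normal, tumor)
```

Positive values indicate higher expression in the second state, negative values lower
expression, and values near zero indicate no change.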

Screens are performed to identify modulators of the lung cancer phenotype. In one
embodiment, screening is performed to identify modulators that can induce or suppress a
particular expression profile, thus preferably generating the associated phenotype. In another
embodiment, e.g., for diagnostic applications, having identified differentially expressed genes
important in a particular state, screens can be performed to identify modulators that alter
expression of individual genes. In another embodiment, screening is performed to identify
modulators that alter a biological function of the expression product of a differentially
expressed gene. Again, having identified the importance of a gene in a particular state,
screens are performed to identify agents that bind and/or modulate the biological activity of
the gene product, or evaluate genetic polymorphisms.

Genes can be screened for those that are induced in response to a candidate agent.
After identifying a modulator based upon its ability to suppress a lung cancer expression
pattern leading to a normal expression pattern, or to modulate a single lung cancer gene
expression profile so as to mimic the expression of the gene from normal tissue, a screen as
described above can be performed to identify genes that are specifically modulated in
response to the agent. Comparing expression profiles between normal tissue and agent
treated lung cancer tissue reveals genes that are not expressed in normal tissue or lung cancer
tissue, but are expressed in agent treated tissue. These agent-specific sequences can be
identified and used by methods described herein for lung cancer genes or proteins. In
particular these sequences and the proteins they encode find use in marking or identifying
agent treated cells. In addition, antibodies can be raised against the agent induced proteins
and used to target novel therapeutics to the treated lung cancer tissue sample.

Thus, in one embodiment, a test compound is administered to a population of lung
cancer cells that have an associated lung cancer expression profile. By "administration" or
"contacting" herein is meant that the candidate agent is added to the cells in such a manner as
to allow the agent to act upon the cell, whether by uptake and intracellular action, or by
action at the cell surface. In some embodiments, nucleic acid encoding a proteinaceous
candidate agent (i.e., a peptide) may be put into a viral construct such as an adenoviral or
retroviral construct, and added to the cell, such that expression of the peptide agent is
accomplished (see, e.g., PCT US97/01019). Regulatable gene therapy systems can also be used.

Once a test compound has been administered to the cells, the cells can be washed if
desired and are allowed to incubate under preferably physiological conditions for some
period of time. The cells are then harvested and a new gene expression profile is generated,
as outlined herein.

Thus, e.g., lung cancer or non-malignant tissue may be screened for agents that
modulate, e.g., induce or suppress a lung cancer phenotype. A change in at least one gene,
preferably many, of the expression profile indicates that the agent has an effect on lung
cancer activity. By defining such a signature for the lung cancer phenotype, screens for new
drugs that alter the phenotype can be devised. With this approach, the drug target need not be
known and need not be represented in the original expression screening platform, nor does
the level of transcript for the target protein need to change.

Measure of lung cancer polypeptide activity, or of lung cancer or the lung cancer
phenotype can be performed using a variety of assays. For example, the effects of the test
compounds upon the function of the metastatic polypeptides can be measured by examining
parameters described above. A suitable physiological change that affects activity can be used
to assess the influence of a test compound on the polypeptides of this invention. When the
functional consequences are determined using intact cells or animals, one can also measure a
variety of effects such as, in the case of lung cancer associated with tumors, tumor growth,
tumor metastasis, neovascularization, hormone release, transcriptional changes to both known
and uncharacterized genetic markers (e.g., northern blots), changes in cell metabolism such as
cell growth or pH changes, and changes in intracellular second messengers such as cGMP. In
the assays of the invention, mammalian lung cancer polypeptide is typically used, e.g.,
mouse, preferably human.

Assays to identify compounds with modulating activity can be performed in vitro.
For example, a lung cancer polypeptide is first contacted with a potential modulator and
incubated for a suitable amount of time, e.g., from 0.5 to 48 hours. In one embodiment, the
lung cancer polypeptide levels are determined in vitro by measuring the level of protein or
mRNA. The level of protein is typically measured using immunoassays such as western
blotting, ELISA and the like with an antibody that selectively binds to the lung cancer
polypeptide or a fragment thereof. For measurement of mRNA, amplification, e.g., using
PCR, LCR, or hybridization assays, e.g., northern hybridization, RNAse protection, dot
blotting, are preferred. The level of protein or mRNA is typically detected using directly or
indirectly labeled detection agents, e.g., fluorescently or radioactively labeled nucleic acids,
radioactively or enzymatically labeled antibodies, and the like, as described herein.

Alternatively, a reporter gene system can be devised using a lung cancer protein
promoter operably linked to a reporter gene such as luciferase, green fluorescent protein,
CAT, or β-gal. The reporter construct is typically transfected into a cell. After treatment
with a potential modulator, the amount of reporter gene transcription, translation, or activity
is measured according to standard techniques known to those of skill in the art.

In a preferred embodiment, as outlined above, screens may be done on individual
genes and gene products (proteins). That is, having identified a particular differentially
expressed gene as important in a particular state, screening of modulators of the expression of
the gene or the gene product itself can be done. The gene products of differentially expressed
genes are sometimes referred to herein as "lung cancer proteins." The lung cancer protein
may be a fragment or, alternatively, the full length protein of which a fragment is shown herein.

In one embodiment, screening for modulators of expression of specific genes is
performed. Typically, the expression of only one or a few genes is evaluated. In another
embodiment, screens are designed to first find compounds that bind to differentially
expressed proteins. These compounds are then evaluated for the ability to modulate
differentially expressed activity. Moreover, once initial candidate compounds are identified,
variants can be further screened to better evaluate structure activity relationships.

In a preferred embodiment, binding assays are done. In general, purified or isolated
gene product is used; that is, the gene products of one or more differentially expressed
nucleic acids are made. For example, antibodies are generated to the protein gene products,
and standard immunoassays are run to determine the amount of protein present. Alternatively,
cells comprising the lung cancer proteins can be used in the assays.

Thus, in a preferred embodiment, the methods comprise combining a lung cancer
protein and a candidate compound, and determining the binding of the compound to the lung
cancer protein. Preferred embodiments utilize the human lung cancer protein, although other
mammalian proteins may also be used, e.g., for the development of animal models of human
disease. In some embodiments, as outlined herein, variant or derivative lung cancer proteins
may be used.

Generally, in a preferred embodiment of the methods herein, the lung cancer protein
or the candidate agent is non-diffusably bound to an insoluble support, preferably having
isolated sample receiving areas (e.g., a microtiter plate, an array, etc.). The insoluble
supports may be made of any composition to which the compositions can be bound, that is
readily separated from soluble material, and that is otherwise compatible with the overall
method of screening. The surface of such supports may be solid or porous and of a convenient shape.
Examples of suitable insoluble supports include microtiter plates, arrays, membranes and
beads. These are typically made of glass, plastic (e.g., polystyrene), polysaccharides, nylon
or nitrocellulose, teflon™, etc. Microtiter plates and arrays are especially convenient because
a large number of assays can be carried out simultaneously, using small amounts of reagents
and samples. The particular manner of binding of the composition is typically not crucial so
long as it is compatible with the reagents and overall methods of the invention, maintains the
activity of the composition, and is nondiffusable. Preferred methods of binding include the
use of antibodies (which do not sterically block either the ligand binding site or activation
sequence when the protein is bound to the support), direct binding to "sticky" or ionic
supports, chemical crosslinking, the synthesis of the protein or agent on the surface, etc.
Following binding of the protein or agent, excess unbound material is removed by washing.
The sample receiving areas may then be blocked through incubation with bovine serum
albumin (BSA), casein or other innocuous protein or other moiety.
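
Because supports such as microtiter plates provide isolated sample receiving areas, each
candidate can be tracked by its position on the support. The sketch below assumes a
common 96-well layout (8 rows labeled A to H, 12 numbered columns); the layout, the
function names, and the compound identifiers are illustrative assumptions only.

```python
from string import ascii_uppercase


def plate_wells(rows=8, cols=12):
    """Return well labels in row-major order, e.g. A1..A12, B1..B12, ..."""
    return [f"{ascii_uppercase[r]}{c}" for r in range(rows) for c in range(1, cols + 1)]


def assign_samples(samples, rows=8, cols=12):
    """Map each sample to its own well; refuse to overfill the plate."""
    wells = plate_wells(rows, cols)
    if len(samples) > len(wells):
        raise ValueError("more samples than wells on the plate")
    return dict(zip(wells, samples))


# Hypothetical compound identifiers; the first well holds a no-compound control.
layout = assign_samples(["no_compound_control", "compound_001", "compound_002"])
```

Keeping the layout as an explicit mapping makes it straightforward to relate each
detector reading back to the candidate agent that occupied the well.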

In a preferred embodiment, the lung cancer protein is bound to the support, and a test
compound is added to the assay. Alternatively, the candidate agent is bound to the support
and the lung cancer protein is added. Novel binding agents include specific antibodies,
non-natural binding agents identified in screens of chemical libraries, peptide analogs, etc. Of
particular interest are screening assays for agents that have a low toxicity for human cells. A
wide variety of assays may be used for this purpose, including labeled in vitro protein-protein
binding assays, electrophoretic mobility shift assays, immunoassays for protein binding,
functional assays (phosphorylation assays, etc.) and the like.

The determination of the binding of the test modulating compound to the lung cancer
protein may be done in a number of ways. In a preferred embodiment, the compound is
labeled, and binding determined directly, e.g., by attaching all or a portion of the lung cancer
protein to a solid support, adding a labeled candidate agent (e.g., a fluorescent label), washing
off excess reagent, and determining whether the label is present on the solid support. Various
blocking and washing steps may be utilized as appropriate.

In some embodiments, only one of the components is labeled, e.g., the proteins (or
proteinaceous candidate compounds) can be labeled. Alternatively, more than one
component can be labeled with different labels, e.g., 125I for the proteins and a fluorophor for
the compound. Proximity reagents, e.g., quenching or energy transfer reagents are also
useful.

In one embodiment, the binding of the test compound is determined by competitive
binding assay. The competitor may be a binding moiety known to bind to the target molecule
(i.e., a lung cancer protein), such as an antibody, peptide, binding partner, ligand, etc. Under
certain circumstances, there may be competitive binding between the compound and the
binding moiety, with the binding moiety displacing the compound. In one embodiment, the
test compound is labeled. Either the compound, or the competitor, or both, is added first to
the protein for a time sufficient to allow binding, if present. Incubations may be performed at
a temperature which facilitates optimal activity, typically between 4 and 40°C. Incubation
periods are typically optimized, e.g., to facilitate rapid high throughput screening. Typically
between 0.1 and 1 hour will be sufficient. Excess reagent is generally removed or washed
away. The second component is then added, and the presence or absence of the labeled
component is followed, to indicate binding.

In a preferred embodiment, the competitor is added first, followed by a test
compound. Displacement of the competitor is an indication that the test compound is binding
to the lung cancer protein and thus is capable of binding to, and potentially modulating, the
activity of the lung cancer protein. In this embodiment, either component can be labeled.
Thus, e.g., if the competitor is labeled, the presence of label in the wash solution indicates
displacement by the agent. Alternatively, if the test compound is labeled, the presence of the
label on the support indicates displacement.

In an alternative embodiment, the test compound is added first, with incubation and
washing, followed by the competitor. The absence of binding by the competitor may indicate
that the test compound is bound to the lung cancer protein with a higher affinity. Thus, if the
test compound is labeled, the presence of the label on the support, coupled with a lack of
competitor binding, may indicate that the test compound is capable of binding to the lung
cancer protein.

In a preferred embodiment, the methods comprise differential screening to identify
agents that are capable of modulating the activity of the lung cancer proteins. In one
embodiment, the methods comprise combining a lung cancer protein and a competitor in a
first sample. A second sample comprises a test compound, a lung cancer protein, and a
competitor. The binding of the competitor is determined for both samples, and a change, or
difference in binding between the two samples indicates the presence of an agent capable of
binding to the lung cancer protein and potentially modulating its activity. That is, if the
binding of the competitor is different in the second sample relative to the first sample, the
agent is capable of binding to the lung cancer protein.
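
The two-sample comparison described above can be sketched as a simple calculation on the
measured competitor binding. The function names, the signal values, and in particular the
20% threshold are hypothetical choices made for illustration; the specification does not
state how large a difference must be.

```python
def competitor_binding_change(first_sample, second_sample):
    """Fractional change in competitor binding when the test compound is present.

    `first_sample`: competitor binding with protein and competitor only.
    `second_sample`: competitor binding with test compound, protein, and competitor.
    """
    if first_sample <= 0:
        raise ValueError("first-sample binding must be positive")
    return (second_sample - first_sample) / first_sample


def flags_candidate(first_sample, second_sample, threshold=0.2):
    """True if competitor binding differs by at least `threshold` (an assumed cutoff)."""
    return abs(competitor_binding_change(first_sample, second_sample)) >= threshold


change = competitor_binding_change(1000.0, 400.0)   # made-up signal values
is_candidate = flags_candidate(1000.0, 400.0)
is_inactive = flags_candidate(1000.0, 950.0)
```

In practice the cutoff would be set from the variability of replicate controls rather
than fixed in advance.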

Alternatively, differential screening is used to identify drug candidates that bind to the
native lung cancer protein, but cannot bind to modified lung cancer proteins. The structure of
the lung cancer protein may be modeled, and used in rational drug design to synthesize agents
that interact with that site. Drug candidates that affect the activity of a lung cancer protein
are also identified by screening drugs for the ability to either enhance or reduce the activity of
the protein.

Positive controls and negative controls may be used in the assays. Preferably control
and test samples are performed in at least triplicate to obtain statistically significant results.
Incubation of all samples is for a time sufficient for the binding of the agent to the protein.
Following incubation, samples are washed free of non-specifically bound material and the
amount of bound, generally labeled agent determined. For example, where a radiolabel is
employed, the samples may be counted in a scintillation counter to determine the amount of
bound compound.
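
As a hedged illustration of why replicates matter, the sketch below summarizes triplicate
scintillation counts and computes a Welch t statistic comparing test wells with control
wells. The choice of this statistic, the function names, and the counts are assumptions
for illustration; the text only calls for at least triplicate samples.

```python
import math
from statistics import mean, stdev


def summarize(counts):
    """Mean and sample standard deviation of replicate counts (at least triplicate)."""
    if len(counts) < 3:
        raise ValueError("at least three replicates are expected")
    return mean(counts), stdev(counts)


def welch_t(control, test):
    """Welch's t statistic for the difference between test and control means."""
    m_control, s_control = summarize(control)
    m_test, s_test = summarize(test)
    standard_error = math.sqrt(s_control**2 / len(control) + s_test**2 / len(test))
    if standard_error == 0:
        raise ValueError("replicates show no variation; t statistic is undefined")
    return (m_test - m_control) / standard_error


# Made-up counts per minute for control and test wells.
control_cpm = [100, 110, 90]
test_cpm = [200, 210, 190]
t_statistic = welch_t(control_cpm, test_cpm)
```

A large absolute t statistic suggests the difference in bound label is unlikely to be
due to replicate-to-replicate noise; converting it to a p-value requires the
corresponding degrees of freedom, which this sketch does not compute.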

A variety of other reagents may be included in the screening assays. These include
reagents like salts, neutral proteins, e.g., albumin, detergents, etc. which may be used to
facilitate optimal protein-protein binding and/or reduce non-specific or background
interactions. Also reagents that otherwise improve the efficiency of the assay, such as
protease inhibitors, nuclease inhibitors, anti-microbial agents, etc., may be used. The mixture
of components may be added in an order that provides for the requisite binding.

In a preferred embodiment, the invention provides methods for screening for a
compound capable of modulating the activity of a lung cancer protein. The methods
comprise adding a test compound, as defined above, to a cell comprising lung cancer
proteins. Preferred cell types include almost any cell. The cells contain a recombinant
nucleic acid that encodes a lung cancer protein. In a preferred embodiment, a library of
candidate agents is tested on a plurality of cells.

In one aspect, the assays are evaluated in the presence or absence of, or with previous or
subsequent exposure to, physiological signals, e.g., hormones, antibodies, peptides, antigens,
cytokines, growth factors, action potentials, pharmacological agents including
chemotherapeutics, radiation, carcinogenics, or other cells (e.g., cell-cell contacts). In another
example, the determinations are made at different stages of the cell cycle process.

In this way, compounds that modulate lung cancer agents are identified. Compounds
with pharmacological activity are able to enhance or interfere with the activity of the lung
cancer protein. Once identified, similar structures are evaluated to identify critical structural
features of the compound.

In one embodiment, a method of inhibiting lung cancer cell division is provided. The
method comprises administration of a lung cancer inhibitor. In another embodiment, a
method of inhibiting lung cancer is provided. The method may comprise administration of a
lung cancer inhibitor. In a further embodiment, methods of treating cells or individuals with
lung cancer are provided, e.g., comprising administration of a lung cancer inhibitor.

In one embodiment, a lung cancer inhibitor is an antibody as discussed above. In
another embodiment, the lung cancer inhibitor is an antisense molecule.

A variety of cell growth, proliferation, viability, and metastasis assays are known to
those of skill in the art, as described below.

Soft agar growth or colony formation in suspension 
Normal cells require a solid substrate to attach and grow. When the cells are
transformed, they lose this phenotype and grow detached from the substrate. For example,
transformed cells can grow in stirred suspension culture or suspended in semi-solid media,
such as semi-solid or soft agar. The transformed cells, when transfected with tumor
suppressor genes, regenerate normal phenotype and require a solid substrate to attach and
grow. Soft agar growth or colony formation in suspension assays can be used to identify
modulators of lung cancer sequences, which when expressed in host cells, inhibit abnormal
cellular proliferation and transformation. A therapeutic compound would reduce or eliminate
the host cells' ability to grow in stirred suspension culture or suspended in semi-solid media,
such as semi-solid or soft agar.

Techniques for soft agar growth or colony formation in suspension assays are
described in Freshney (1994) Culture of Animal Cells: A Manual of Basic Technique (3rd ed.),
herein incorporated by reference. See also, the methods section of Garkavtsev, et al. (1996),
supra, herein incorporated by reference.

Contact inhibition and density limitation of growth

Normal cells typically grow in a flat and organized pattern in a petri dish until they
touch other cells. When the cells touch one another, they are contact inhibited and stop
growing. When cells are transformed, however, the cells are not contact inhibited and
continue to grow to high densities in disorganized foci. Thus, the transformed cells grow to a
higher saturation density than normal cells. This can be detected morphologically by the
formation of a disoriented monolayer of cells or rounded cells in foci within the regular
pattern of normal surrounding cells. Alternatively, labeling index with (³H)-thymidine at
saturation density can be used to measure density limitation of growth. See Freshney (1994),
supra. The transformed cells, when transfected with tumor suppressor genes, regenerate a
normal phenotype and become contact inhibited and would grow to a lower density.

In this assay, labeling index with (³H)-thymidine at saturation density is a preferred
method of measuring density limitation of growth. Transformed host cells are transfected
with a lung cancer-associated sequence and are grown for 24 hours at saturation density in
non-limiting medium conditions. The percentage of cells labeling with (³H)-thymidine is
determined autoradiographically. See, Freshney (1994), supra.
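
The labeling index described above is a simple percentage, and the arithmetic can be sketched as follows. This is a minimal illustration only: the function name and the cell counts are hypothetical and are not taken from Freshney (1994) or from this specification.

```python
def labeling_index(labeled_cells: int, total_cells: int) -> float:
    """Return the percentage of scored cells that carry label."""
    if total_cells <= 0:
        raise ValueError("total_cells must be positive")
    if not 0 <= labeled_cells <= total_cells:
        raise ValueError("labeled_cells must lie between 0 and total_cells")
    return 100.0 * labeled_cells / total_cells


# Hypothetical counts scored from autoradiographs at saturation density.
control_index = labeling_index(labeled_cells=180, total_cells=400)
transfected_index = labeling_index(labeled_cells=40, total_cells=400)

# A lower index in the transfected cells would be consistent with
# restored density limitation of growth.
restored = transfected_index < control_index
```

Comparing the index of transfected host cells with that of untransfected controls scored under the same conditions gives the relative measure the assay relies on.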



Growth factor or serum dependence 

Transformed cells typically have a lower serum dependence than their normal
counterparts (see, e.g., Temin (1966) J. Natl. Cancer Inst. 37:167-175; Eagle, et al. (1970) J.
Exp. Med. 131:836-879; Freshney, supra). This is in part due to release of various growth
factors by the transformed cells. Growth factor or serum dependence of transformed host
cells can be compared with that of control cells.



Tumor specific marker levels

Tumor cells release a greater amount of certain factors (hereinafter "tumor
specific markers") than their normal counterparts. For example, plasminogen activator (PA)
is released from human glioma at a higher level than from normal brain cells (see, e.g.,
Gullino, "Angiogenesis, tumor vascularization, and potential interference with tumor growth"
in Mihich (ed. 1985) Biological Responses in Cancer, pp. 178-184). Similarly, tumor
angiogenesis factor (TAF) is released at a higher level in tumor cells than in their normal
counterparts. See, e.g., Folkman (1992) "Angiogenesis and Cancer" in Sem. Cancer Biol.

Various techniques which measure the release of these factors are described in
Freshney (1994), supra. Also, see, Unkeless, et al. (1974) J. Biol. Chem. 249:4295-4305;
Strickland and Beers (1976) J. Biol. Chem. 251:5694-5702; Whur, et al. (1980) Br. J. Cancer
42:305-312; Gullino, "Angiogenesis, tumor vascularization, and potential interference with
tumor growth" in Mihich (ed. 1985) Biological Responses in Cancer, pp. 178-184; Freshney
(1985) Anticancer Res. 5:111-130.


Invasiveness into Matrigel 

The degree of invasiveness into Matrigel or some other extracellular matrix
constituent can be used as an assay to identify compounds that modulate lung cancer-associated
sequences. Tumor cells exhibit a good correlation between malignancy and
invasiveness of cells into Matrigel or some other extracellular matrix constituent. In this
assay, tumorigenic cells are typically used as host cells. Expression of a tumor suppressor
gene in these host cells would decrease invasiveness of the host cells.






WO 02/086443 PCT/US02/12476 
Techniques described in Freshney (1994), supra, can be used. Briefly, the level of 

invasion of host cells can be measured by using filters coated with Matrigel or some other 

extracellular matrix constituent. Penetration into the gel, or through to the distal side of the 

filter, is rated as invasiveness, and rated histologically by number of cells and distance 

moved, or by prelabeling the cells with ¹²⁵I and counting the radioactivity on the distal side of

the filter or bottom of the dish. See, e.g., Freshney (1984), supra. 
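
For the radiolabel readout, one way to express invasiveness is as the fraction of loaded counts recovered on the distal side of the filter. The sketch below assumes that convention, together with a single background reading subtracted from both measurements; the function name, the background correction, and the counts are illustrative assumptions and are not specified in the text or in Freshney.

```python
def invasion_fraction(distal_cpm: float, loaded_cpm: float,
                      background_cpm: float = 0.0) -> float:
    """Fraction of loaded counts recovered on the distal side of the filter.

    Both readings are background-corrected before the ratio is taken.
    """
    distal = max(distal_cpm - background_cpm, 0.0)
    loaded = loaded_cpm - background_cpm
    if loaded <= 0:
        raise ValueError("loaded counts must exceed background")
    return distal / loaded


# Hypothetical counts per minute for control host cells and for host cells
# expressing a candidate sequence.
control_fraction = invasion_fraction(distal_cpm=550, loaded_cpm=5050, background_cpm=50)
test_fraction = invasion_fraction(distal_cpm=150, loaded_cpm=5050, background_cpm=50)
less_invasive = test_fraction < control_fraction
```

A lower fraction for the test cells than for the controls would indicate decreased invasiveness under these assumptions.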



Tumor growth in vivo 

Effects of lung cancer-associated sequences on cell growth can be tested in transgenic
or immune-suppressed mice. Knock-out transgenic mice can be made, in which the lung
cancer gene is disrupted or in which a lung cancer gene is inserted. Knock-out transgenic
mice can be made by insertion of a marker gene or other heterologous gene into the
endogenous lung cancer gene site in the mouse genome via homologous recombination.
Such mice can also be made by substituting the endogenous lung cancer gene with a mutated
version of the lung cancer gene, or by mutating the endogenous lung cancer gene, e.g., by
exposure to carcinogens.

A DNA construct is introduced into the nuclei of embryonic stem cells. Cells
containing the newly engineered genetic lesion are injected into a host mouse embryo, which
is re-implanted into a recipient female. Some of these embryos develop into chimeric mice
that possess germ cells partially derived from the mutant cell line. Therefore, by breeding the
chimeric mice it is possible to obtain a new line of mice containing the introduced genetic
lesion (see, e.g., Capecchi, et al. (1989) Science 244:1288). Chimeric targeted mice can be
derived according to Hogan, et al. (1988) Manipulating the Mouse Embryo: A Laboratory
Manual, Cold Spring Harbor Laboratory and Robertson (ed. 1987) Teratocarcinomas and
Embryonic Stem Cells: A Practical Approach, IRL Press, Washington, D.C.

Alternatively, various immune-suppressed or immune-deficient host animals can be
used. For example, a genetically athymic "nude" mouse (see, e.g., Giovanella, et al. (1974) J.
Natl. Cancer Inst. 52:921), a SCID mouse, a thymectomized mouse, or an irradiated mouse
(see, e.g., Bradley, et al. (1978) Br. J. Cancer 38:263; Selby, et al. (1980) Br. J. Cancer 41:52)
can be used as a host. Transplantable tumor cells (typically about 10⁶ cells) injected into
isogenic hosts will produce invasive tumors in a high proportion of cases, while normal cells
of similar origin will not. In hosts which developed invasive tumors, cells expressing a lung
cancer-associated sequence are injected subcutaneously. After a suitable length of time,
preferably 4-8 weeks, tumor growth is measured (e.g., by volume or by its two largest
dimensions) and compared to the control. Tumors that show a statistically significant reduction
(using, e.g., Student's t-test) are said to have inhibited growth.
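
The comparison of tumor growth against a control can be sketched as a two-sample Student's t-test on tumor volumes. The example below is illustrative only. It assumes the common ellipsoid approximation V = L x W^2 / 2 for estimating volume from the two largest dimensions, which the text does not specify; it uses hypothetical measurements; and it compares the statistic against a critical value taken from a standard t table rather than computing a p-value.

```python
import math
from statistics import mean, variance


def tumor_volume(length_mm: float, width_mm: float) -> float:
    """Estimate volume from the two largest dimensions (assumed V = L * W^2 / 2)."""
    return length_mm * width_mm ** 2 / 2.0


def pooled_t_statistic(control, treated):
    """Return (t, degrees of freedom) for an equal-variance two-sample t-test."""
    n1, n2 = len(control), len(treated)
    if n1 < 2 or n2 < 2:
        raise ValueError("each group needs at least two measurements")
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * variance(control) + (n2 - 1) * variance(treated)) / df
    std_err = math.sqrt(pooled_var * (1.0 / n1 + 1.0 / n2))
    if std_err == 0:
        raise ValueError("no variation in either group; t is undefined")
    return (mean(control) - mean(treated)) / std_err, df


# Hypothetical (length, width) measurements in mm, four animals per group.
control = [tumor_volume(l, w) for l, w in [(12, 9), (11, 8), (13, 10), (12, 8)]]
treated = [tumor_volume(l, w) for l, w in [(8, 6), (9, 6), (7, 5), (8, 5)]]

t_stat, df = pooled_t_statistic(control, treated)

# Two-tailed critical value for alpha = 0.05 with 6 degrees of freedom,
# from a standard t table.
T_CRITICAL_DF6 = 2.447
significantly_reduced = df == 6 and t_stat > T_CRITICAL_DF6
```

With other group sizes the critical value would need to be looked up for the corresponding degrees of freedom.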



Polynucleotide modulators of lung cancer

Antisense and RNAi Polynucleotides 

In certain embodiments, the activity of a lung cancer-associated protein is
downregulated, or entirely inhibited, by the use of antisense or an inhibitory polynucleotide,
i.e., a nucleic acid complementary to, and which can preferably hybridize specifically to, a
coding mRNA nucleic acid sequence, e.g., a lung cancer protein mRNA, or a subsequence
thereof. Binding of the antisense polynucleotide to the mRNA reduces the translation and/or
stability of the mRNA.

In the context of this invention, antisense polynucleotides can comprise naturally-occurring
nucleotides, or synthetic species formed from naturally-occurring subunits or their
close homologs. Antisense polynucleotides may also have altered sugar moieties or inter-sugar
linkages. Exemplary among these are the phosphorothioate and other sulfur-containing
species which are known for use in the art. Analogs are comprehended by this invention so
long as they function effectively to hybridize with the lung cancer protein mRNA. See, e.g.,
Isis Pharmaceuticals, Carlsbad, CA; Sequitor, Inc., Natick, MA.

Such antisense polynucleotides can readily be synthesized using recombinant means,
or can be synthesized in vitro. Equipment for such synthesis is sold by several vendors,
including Applied Biosystems. The preparation of other oligonucleotides such as
phosphorothioates and alkylated derivatives is also well known to those of skill in the art.

Antisense molecules as used herein include antisense or sense oligonucleotides.
Sense oligonucleotides can, e.g., be employed to block transcription by binding to the
antisense strand. The antisense and sense oligonucleotides comprise a single-stranded nucleic
acid sequence (either RNA or DNA) capable of binding to target mRNA (sense) or DNA
(antisense) sequences for lung cancer molecules. A preferred antisense molecule is for a lung
cancer sequence in the tables, or for a ligand or activator thereof. Antisense or sense
oligonucleotides, according to the present invention, comprise a fragment generally at least
about 14 nucleotides, preferably from about 14 to 30 nucleotides. The ability to derive an
antisense or a sense oligonucleotide, based upon a cDNA sequence encoding a given protein
is described in, e.g., Stein and Cohen (1988) Cancer Res. 48:2659 and van der Krol, et al.
(1988) BioTechniques 6:958.
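
Deriving candidate antisense oligonucleotides from a known target sequence amounts to taking the reverse complement of windows of the target. The sketch below enumerates every window of a chosen length within the 14 to 30 nucleotide range stated above. The function name and the example sequence are hypothetical, the output is written as RNA, and no selection criteria (such as target accessibility or specificity) are applied.

```python
# Watson-Crick complement table for RNA bases.
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def antisense_candidates(target: str, length: int = 20):
    """Return (start, antisense) pairs, one for each window of the target.

    `target` may be given as mRNA or as the coding-strand cDNA; T is read as U.
    Each antisense sequence is written 5' to 3'.
    """
    if not 14 <= length <= 30:
        raise ValueError("length must be between 14 and 30 nucleotides")
    seq = target.upper().replace("T", "U")
    if set(seq) - set("ACGU"):
        raise ValueError("target contains characters other than A, C, G, U/T")
    return [
        (start, seq[start:start + length].translate(COMPLEMENT)[::-1])
        for start in range(len(seq) - length + 1)
    ]


# Hypothetical 15-nt target fragment; two 14-nt windows are possible.
candidates = antisense_candidates("AUGGCCAAGCUUGAC", length=14)
```

A DNA oligonucleotide would be obtained by writing T in place of U in each candidate.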

RNA interference is a mechanism to suppress gene expression in a sequence-specific
manner. See, e.g., Brummelkamp, et al. (2002) Sciencexpress (21 March 2002); Sharp (1999)
Genes Dev. 13:139-141; and Carthew (2001) Curr. Op. Cell Biol. 13:244-248. In mammalian
cells, short, e.g., 21 nt, double-stranded small interfering RNAs (siRNA) have been shown to
be effective at inducing an RNAi response. See, e.g., Elbashir, et al. (2001) Nature 411:494-498.
The mechanism may be used to downregulate expression levels of identified genes, e.g., for
treatment of, or validation of relevance to, disease.



Ribozymes 

In addition to antisense polynucleotides, ribozymes can be used to target and inhibit
transcription of lung cancer-associated nucleotide sequences. A ribozyme is an RNA
molecule that catalytically cleaves other RNA molecules. Different kinds of ribozymes have
been described, including group I ribozymes, hammerhead ribozymes, hairpin ribozymes,
RNase P, and axehead ribozymes (see, e.g., Castanotto, et al. (1994) Adv. in Pharmacology
25:289-317 for a general review of the properties of different ribozymes).

The general features of hairpin ribozymes are described, e.g., in Hampel, et al. (1990)
Nucl. Acids Res. 18:299-304; European Patent Publication No. 0 360 257; U.S. Patent No.
5,254,678. Methods of preparing ribozymes are well known to those of skill in the art (see,
e.g., WO 94/26877; Ojwang, et al. (1993) Proc. Natl. Acad. Sci. USA 90:6340-6344; Yamada, et al.
(1994) Human Gene Therapy 1:39-45; Leavitt, et al. (1995) Proc. Natl. Acad. Sci. USA
92:699-703; Leavitt, et al. (1994) Human Gene Therapy 5:1151-120; and Yamada, et al.
(1994) Virology 205:121-126).

Polynucleotide modulators of lung cancer may be introduced into a cell containing the
target nucleotide sequence by formation of a conjugate with a ligand binding molecule, as
described in WO 91/04753. Suitable ligand binding molecules include, but are not limited to,
cell surface receptors, growth factors, other cytokines, or other ligands that bind to cell
surface receptors. Preferably, conjugation of the ligand binding molecule does not
substantially interfere with the ability of the ligand binding molecule to bind to its
corresponding molecule or receptor, or block entry of the sense or antisense oligonucleotide
or its conjugated version into the cell. Alternatively, a polynucleotide modulator of lung
cancer may be introduced into a cell containing the target nucleic acid sequence, e.g., by
formation of a polynucleotide-lipid complex, as described in WO 90/10448. It is
understood that antisense molecules or knock-out and knock-in models may also be
used in screening assays as discussed above, in addition to methods of treatment.

Thus, in one embodiment, methods of modulating lung cancer in cells or organisms
are provided. In one embodiment, the methods comprise administering to a cell an anti-lung
cancer antibody that reduces or eliminates the biological activity of an endogenous lung
cancer protein. Alternatively, the methods comprise administering to a cell or organism a
recombinant nucleic acid encoding a lung cancer protein. This may be accomplished in any
number of ways. In a preferred embodiment, e.g., when the lung cancer sequence is down-regulated
in lung cancer, such state may be reversed by increasing the amount of lung cancer
gene product in the cell. This can be accomplished, e.g., by overexpressing the endogenous
lung cancer gene or administering a gene encoding the lung cancer sequence, using known
gene-therapy techniques. In a preferred embodiment, the gene therapy techniques include the
incorporation of the exogenous gene using enhanced homologous recombination (EHR), e.g.,
as described in PCT/US93/03868, hereby incorporated by reference in its entirety.
Alternatively, e.g., when the lung cancer sequence is up-regulated in lung cancer, the activity
of the endogenous lung cancer gene is decreased, e.g., by the administration of a lung cancer
antisense or RNAi nucleic acid.

In one embodiment, the lung cancer proteins of the present invention may be used to
generate polyclonal and monoclonal antibodies to lung cancer proteins. Similarly, the lung
cancer proteins can be coupled, using standard technology, to affinity chromatography
columns. These columns may then be used to purify lung cancer antibodies useful for
production, diagnostic, or therapeutic purposes. In a preferred embodiment, the antibodies
are generated to epitopes unique to a lung cancer protein; that is, the antibodies show little or
no cross-reactivity to other proteins. The lung cancer antibodies may be coupled to standard
affinity chromatography columns and used to purify lung cancer proteins. The antibodies
may also be used as blocking polypeptides, as outlined above, since they will specifically
bind to the lung cancer protein.

Methods of identifying variant lung cancer-associated sequences

Without being bound by theory, expression of various lung cancer sequences is 
correlated with lung cancer. Accordingly, disorders based on mutant or variant lung cancer 
genes may be determined. In one embodiment, the invention provides methods for
identifying cells containing variant lung cancer genes, e.g., determining all or part of the
sequence of at least one endogenous lung cancer gene in a cell. In a preferred embodiment,
the invention provides methods of identifying the lung cancer genotype of an individual, e.g., 
determining all or part of the sequence of at least one lung cancer gene of the individual. 
This is generally done in at least one tissue of the individual, and may include the evaluation 
of a number of tissues or different samples of the same tissue. The method may include 
comparing the sequence of the sequenced lung cancer gene to a known lung cancer gene, i.e., 
a wild-type gene. 

The sequence of all or part of the lung cancer gene can then be compared to the 
sequence of a known lung cancer gene to determine if any differences exist. This can be 
done using known homology programs, such as Bestfit, etc. In a preferred embodiment, the 
presence of a difference in the sequence between the lung cancer gene of the patient and the 
known lung cancer gene correlates with a disease state or a propensity for a disease state, as 
outlined herein. 
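
The comparison step can be illustrated with a position-by-position check of a sample sequence against a reference. This sketch is not the Bestfit program and performs no alignment: it assumes the two sequences have already been aligned to equal length, and the function name and sequences are hypothetical.

```python
def sequence_differences(sample: str, reference: str):
    """List (position, reference_base, sample_base) for each mismatch.

    Positions are 1-based. The sequences must be pre-aligned to equal length;
    insertions and deletions are outside the scope of this sketch.
    """
    if len(sample) != len(reference):
        raise ValueError("sequences must be pre-aligned to equal length")
    return [
        (pos, ref_base, sample_base)
        for pos, (ref_base, sample_base)
        in enumerate(zip(reference.upper(), sample.upper()), start=1)
        if ref_base != sample_base
    ]


# Hypothetical reference (wild-type) and patient fragments.
differences = sequence_differences(sample="ATGCCTGA", reference="ATGCATGA")
has_variant = bool(differences)
```

Whether any difference found in this way correlates with a disease state would have to be established separately, as the text indicates.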

In a preferred embodiment, the lung cancer genes are used as probes to determine the 
number of copies of the lung cancer gene in the genome. 

In another preferred embodiment, the lung cancer genes are used as probes to 
determine the chromosomal localization of the lung cancer genes. Information such as 
chromosomal localization finds use in providing a diagnosis or prognosis in particular when 
chromosomal abnormalities such as translocations, and the like are identified in the lung 
cancer gene locus. 

Administration of pharmaceutical and vaccine compositions 

In one embodiment, a therapeutically effective dose of a lung cancer protein or
modulator thereof is administered to a patient. By "therapeutically effective dose" herein is
meant a dose that produces the effects for which it is administered. The exact dose will depend
on the purpose of the treatment, and will be ascertainable by one skilled in the art using
known techniques (e.g., Ansel, et al. (1992) Pharmaceutical Dosage Forms and Drug
Delivery; Lieberman, Pharmaceutical Dosage Forms (vols. 1-3), Dekker, ISBN 0824770846,
082476918X, 0824712692, 0824716981; Lloyd (1999) The Art, Science and Technology of
Pharmaceutical Compounding; and Pickar (1999) Dosage Calculations). Adjustments for
lung cancer degradation, systemic versus localized delivery, and rate of new protease
synthesis, as well as the age, body weight, general health, sex, diet, time of administration,
drug interaction and the severity of the condition may be necessary, and will be ascertainable
with routine experimentation by those skilled in the art.

A "patient" for the purposes of the present invention includes both humans and other
animals, particularly mammals. Thus the methods are applicable to both human therapy and
veterinary applications. In the preferred embodiment the patient is a mammal, preferably a
primate, and in the most preferred embodiment the patient is human.

The administration of the lung cancer proteins and modulators thereof of the present
invention can be done in a variety of ways, including, but not limited to, orally,
subcutaneously, intravenously, intranasally, transdermally, intraperitoneally, intramuscularly,
intrapulmonary, vaginally, rectally, or intraocularly. In some instances, e.g., in the treatment
of wounds and inflammation, the lung cancer proteins and modulators may be directly
applied as a solution or spray.

The pharmaceutical compositions of the present invention comprise a lung cancer
protein in a form suitable for administration to a patient. In the preferred embodiment, the
pharmaceutical compositions are in a water soluble form, such as being present as
pharmaceutically acceptable salts, which is meant to include both acid and base addition
salts. "Pharmaceutically acceptable acid addition salt" refers to those salts that retain the
biological effectiveness of the free bases and that are not biologically or otherwise
undesirable, formed with inorganic acids such as hydrochloric acid, hydrobromic acid,
sulfuric acid, nitric acid, phosphoric acid and the like, and organic acids such as acetic acid,
propionic acid, glycolic acid, pyruvic acid, oxalic acid, maleic acid, malonic acid, succinic
acid, fumaric acid, tartaric acid, citric acid, benzoic acid, cinnamic acid, mandelic acid,
methanesulfonic acid, ethanesulfonic acid, p-toluenesulfonic acid, salicylic acid and the like.
"Pharmaceutically acceptable base addition salts" include those derived from inorganic bases
such as sodium, potassium, lithium, ammonium, calcium, magnesium, iron, zinc, copper,
manganese, aluminum salts and the like. Particularly preferred are the ammonium,
potassium, sodium, calcium, and magnesium salts. Salts derived from pharmaceutically
acceptable organic non-toxic bases include salts of primary, secondary, and tertiary amines,
substituted amines including naturally occurring substituted amines, cyclic amines and basic
ion exchange resins, such as isopropylamine, trimethylamine, diethylamine, triethylamine,
tripropylamine, and ethanolamine.

The pharmaceutical compositions may also include one or more of the following: 
carrier proteins such as serum albumin; buffers; fillers such as microcrystalline cellulose,
lactose, corn and other starches; binding agents; sweeteners and other flavoring agents;
coloring agents; and polyethylene glycol.

The pharmaceutical compositions can be administered in a variety of unit dosage
forms depending upon the method of administration. For example, unit dosage forms
suitable for oral administration include, but are not limited to, powder, tablets, pills, capsules
and lozenges. It is recognized that lung cancer protein modulators (e.g., antibodies, antisense
constructs, ribozymes, small organic molecules, etc.) when administered orally, should be
protected from digestion. This is typically accomplished either by complexing the
molecule(s) with a composition to render it resistant to acidic and enzymatic hydrolysis, or by
packaging the molecule(s) in an appropriately resistant carrier, such as a liposome or a
protection barrier. Means of protecting agents from digestion are well known in the art.

The compositions for administration will commonly comprise a lung cancer protein
modulator dissolved in a pharmaceutically acceptable carrier, preferably an aqueous carrier.
A variety of aqueous carriers can be used, e.g., buffered saline and the like. These solutions
are sterile and generally free of undesirable matter. These compositions may be sterilized by
conventional, well known sterilization techniques. The compositions may contain
pharmaceutically acceptable auxiliary substances as required to approximate physiological
conditions such as pH adjusting and buffering agents, toxicity adjusting agents and the like,
e.g., sodium acetate, sodium chloride, potassium chloride, calcium chloride, sodium lactate
and the like. The concentration of active agent in these formulations can vary widely, and
will be selected primarily based on fluid volumes, viscosities, body weight and the like in
accordance with the particular mode of administration selected and the patient's needs (e.g.,
Remington's Pharmaceutical Science (15th ed., 1980) and Hardman, et al. (eds. 1996)
Goodman and Gilman: The Pharmacological Basis of Therapeutics).

Thus, a typical pharmaceutical composition for intravenous administration would be
about 0.1 to 10 mg per patient per day. Dosages from 0.1 up to about 100 mg per patient per
day may be used, particularly when the drug is administered to a secluded site and not into
the blood stream, such as into a body cavity or into a lumen of an organ. Substantially higher
dosages are possible in topical administration. Actual methods for preparing parenterally
administrable compositions will be known or apparent to those skilled in the art, e.g.,
Remington's Pharmaceutical Science and Goodman and Gilman, The Pharmacological Basis
of Therapeutics, supra.

The compositions containing modulators of lung cancer proteins can be administered
for therapeutic or prophylactic treatments. In therapeutic applications, compositions are
administered to a patient suffering from a disease (e.g., a cancer) in an amount sufficient to
cure or at least partially arrest the disease and its complications. An amount adequate to
accomplish this is defined as a "therapeutically effective dose." Amounts effective for this
use will depend upon the severity of the disease and the general state of the patient's health.
Single or multiple administrations of the compositions may be administered depending on the
dosage and frequency as required and tolerated by the patient. In any event, the composition
should provide a sufficient quantity of the agents of this invention to effectively treat the
patient. An amount of modulator that is capable of preventing or slowing the development of
cancer in a mammal is referred to as a "prophylactically effective dose." The particular dose
required for a prophylactic treatment will depend upon the medical condition and history of
the mammal, the particular cancer being prevented, as well as other factors such as age,
weight, gender, administration route, efficiency, etc. Such prophylactic treatments may be
used, e.g., in a mammal who has previously had cancer to prevent a recurrence of the cancer,
or in a mammal who is suspected of having a significant likelihood of developing cancer
based, at least in part, upon gene expression profiles. Vaccine strategies may be used, in
either a DNA vaccine form, or protein vaccine.

It will be appreciated that the present lung cancer protein-modulating compounds can
be administered alone or in combination with additional lung cancer modulating compounds
or with other therapeutic agents, e.g., other anti-cancer agents or treatments.

In numerous embodiments, one or more nucleic acids, e.g., polynucleotides
comprising nucleic acid sequences set forth in the tables, such as antisense or RNAi
polynucleotides or ribozymes, will be introduced into cells, in vitro or in vivo. The present
invention provides methods, reagents, vectors, and cells useful for expression of lung cancer-associated
polypeptides and nucleic acids using in vitro (cell-free), ex vivo, or in vivo (cell or
organism-based) recombinant expression systems.

The particular procedure used to introduce the nucleic acids into a host cell for
expression of a protein or nucleic acid is application specific. Many procedures for
introducing foreign nucleotide sequences into host cells may be used. These include the use
of calcium phosphate transfection, spheroplasts, electroporation, liposomes, microinjection,
plasmid vectors, viral vectors and other well known methods for introducing cloned genomic
DNA, cDNA, synthetic DNA or other foreign genetic material into a host cell (see, e.g.,
Berger and Kimmel, Guide to Molecular Cloning Techniques, Methods in Enzymology
volume 152 (Berger); Ausubel, et al. (eds. 1999) Current Protocols (supplemented through
1999); and Sambrook, et al. (1989) Molecular Cloning - A Laboratory Manual (2nd ed., Vol.
1-3)).

In a preferred embodiment, lung cancer proteins and modulators are administered as
therapeutic agents, and can be formulated as outlined above. Similarly, lung cancer genes
(including both the full-length sequence, partial sequences, or regulatory sequences of the
lung cancer coding regions) can be administered in a gene therapy application. These lung
cancer genes can include antisense or inhibitory applications, e.g., as inhibitory RNA or gene
therapy (e.g., for incorporation into the genome) or as antisense compositions.

Lung cancer polypeptides and polynucleotides can also be administered as vaccine
compositions to stimulate HTL, CTL, and antibody responses. Such vaccine compositions
can include, e.g., lipidated peptides (see, e.g., Vitiello, et al. (1995) J. Clin. Invest. 95:341),
peptide compositions encapsulated in poly(DL-lactide-co-glycolide) ("PLG") microspheres
(see, e.g., Eldridge, et al. (1991) Molec. Immunol. 28:287-294; Alonso, et al. (1994) Vaccine
12:299-306; Jones, et al. (1995) Vaccine 13:675-681), peptide compositions contained in
immune stimulating complexes (ISCOMS) (see, e.g., Takahashi, et al. (1990) Nature
344:873-875; Hu, et al. (1998) Clin. Exp. Immunol. 113:235-243), multiple antigen peptide
systems (MAPs) (see, e.g., Tam (1988) Proc. Natl. Acad. Sci. U.S.A. 85:5409-5413; Tam
(1996) J. Immunol. Methods 196:17-32), peptides formulated as multivalent peptides;
peptides for use in ballistic delivery systems, typically crystallized peptides, viral delivery
vectors (Perkus, et al., p. 379 In: Kaufmann (ed. 1996) Concepts in vaccine development;
Chakrabarti, et al. (1986) Nature 320:535; Hu, et al. (1986) Nature 320:537; Kieny, et al.
(1986) AIDS Bio/Technology 4:790; Top, et al. (1971) J. Infect. Dis. 124:148; Chanda, et al.
(1990) Virology 175:535), particles of viral or synthetic origin (see, e.g., Kofler, et al. (1996)
J. Immunol. Methods 192:25; Eldridge, et al. (1993) Sem. Hematol. 30:16; Falo, et al. (1995)
Nature Med. 7:649), adjuvants (Warren, et al. (1986) Annu. Rev. Immunol. 4:369; Gupta, et
al. (1993) Vaccine 11:293), liposomes (Reddy, et al. (1992) J. Immunol. 148:1585; Rock
(1996) Immunol. Today 17:131), or naked or particle absorbed cDNA (Ulmer, et al. (1993)
Science 259:1745; Robinson, et al. (1993) Vaccine 11:957; Shiver, et al., p. 423 In:
Kaufmann (ed. 1996) Concepts in vaccine development; Cease and Berzofsky (1994) Annu.
Rev. Immunol. 12:923 and Eldridge, et al. (1993) Sem. Hematol. 30:16). Toxin-targeted
delivery technologies, also known as receptor mediated targeting, such as those of Avant
Immunotherapeutics, Inc. (Needham, Massachusetts) may also be used.

Vaccine compositions often include adjuvants. Many adjuvants contain a substance
designed to protect the antigen from rapid catabolism, such as aluminum hydroxide or
mineral oil, and a stimulator of immune responses, such as lipid A, Bordetella pertussis or
Mycobacterium tuberculosis derived proteins. Certain adjuvants are commercially available
as, e.g., Freund's Incomplete Adjuvant and Complete Adjuvant (Difco Laboratories, Detroit,
MI); Merck Adjuvant 65 (Merck and Company, Inc., Rahway, NJ); AS-2 (SmithKline
Beecham, Philadelphia, PA); aluminum salts such as aluminum hydroxide gel (alum) or
aluminum phosphate; salts of calcium, iron or zinc; an insoluble suspension of acylated
tyrosine; acylated sugars; cationically or anionically derivatized polysaccharides;
polyphosphazenes; biodegradable microspheres; monophosphoryl lipid A and quil A.
Cytokines, such as GM-CSF, interleukin-2, -7, -12, and other like growth factors, may also be
used as adjuvants.

Vaccines can be administered as nucleic acid compositions wherein DNA or RNA
encoding one or more of the polypeptides, or a fragment thereof, is administered to a patient.
This approach is described, for instance, in Wolff, et al. (1990) Science 247:1465 as well as
U.S. Patent Nos. 5,580,859; 5,589,466; 5,804,566; 5,739,118; 5,736,524; 5,679,647; WO
98/04720; and in more detail below. Examples of DNA-based delivery technologies include
"naked DNA", facilitated (bupivacaine, polymers, peptide-mediated) delivery, cationic lipid
complexes, and particle-mediated ("gene gun") or pressure-mediated delivery (see, e.g., U.S.
Patent No. 5,922,687).

For therapeutic or prophylactic immunization purposes, the peptides of the invention
can be expressed by viral or bacterial vectors. Examples of expression vectors include
attenuated viral hosts, such as vaccinia or fowlpox. This approach involves the use of
vaccinia virus, e.g., as a vector to express nucleotide sequences that encode lung cancer
polypeptides or polypeptide fragments. Upon introduction into a host, the recombinant
vaccinia virus expresses the immunogenic peptide, and thereby elicits an immune response.
Vaccinia vectors and methods useful in immunization protocols are described in, e.g., U.S.
Patent No. 4,722,848. Another vector is BCG (Bacille Calmette Guerin). BCG vectors are
described in Stover, et al. (1991) Nature 351:456-460. A wide variety of other vectors useful
for therapeutic administration or immunization, e.g., adeno and adeno-associated virus
vectors, retroviral vectors, Salmonella typhi vectors, detoxified anthrax toxin vectors, and the
like, will be apparent to those skilled in the art from the description herein (see, e.g., Shata, et
al. (2000) Mol. Med. Today 6:66-71; Shedlock, et al. (2000) J. Leukoc. Biol. 68:793-806;
Hipp, et al. (2000) In Vivo 14:571-85).

Methods for the use of genes as DNA vaccines are well known, and include placing a
lung cancer gene or portion of a lung cancer gene under the control of a regulatable promoter
or a tissue-specific promoter for expression in a lung cancer patient. The lung cancer gene
used for DNA vaccines can encode full-length lung cancer proteins, but more preferably
encodes portions of the lung cancer proteins including peptides derived from the lung cancer
protein. In one embodiment, a patient is immunized with a DNA vaccine comprising a
plurality of nucleotide sequences derived from a lung cancer gene. For example, lung
cancer-associated genes or sequences encoding subfragments of a lung cancer protein are introduced
into expression vectors and tested for their immunogenicity in the context of Class I MHC
and an ability to generate cytotoxic T cell responses. This procedure provides for production
of cytotoxic T cell responses against cells which present antigen, including intracellular
epitopes.

In a preferred embodiment, DNA vaccines include a gene encoding an adjuvant 
molecule with the DNA vaccine. Such adjuvant molecules include cytokines that increase 
the immunogenic response to the lung cancer polypeptide encoded by the DNA vaccine. 
Additional or alternative adjuvants are available. 

In another preferred embodiment, lung cancer genes find use in generating animal
models of lung cancer. When the lung cancer gene identified is repressed or diminished in
metastatic tissue, gene therapy technology, e.g., wherein antisense or inhibitory RNA is directed
to the lung cancer gene, will also diminish or repress expression of the gene. Animal models
of lung cancer find use in screening for modulators of a lung cancer-associated sequence or
modulators of lung cancer. Similarly, transgenic animal technology including gene knockout
technology, e.g., as a result of homologous recombination with an appropriate gene targeting
vector, will result in the absence or increased expression of the lung cancer protein. When
desired, tissue-specific expression or knockout of the lung cancer protein may be necessary.

It is also possible that the lung cancer protein is overexpressed in lung cancer. As
such, transgenic animals can be generated that overexpress the lung cancer protein.
Depending on the desired expression level, promoters of various strengths can be employed
to express the transgene. Also, the number of copies of the integrated transgene can be
determined and compared for a determination of the expression level of the transgene.
Animals generated by such methods will find use as animal models of lung cancer and are
additionally useful in screening for modulators to treat lung cancer.



Kits for Use in Diagnostic and/or Prognostic Applications 

For use in diagnostic, research, and therapeutic applications suggested above, kits are 
also provided by the invention. In diagnostic and research applications such kits may include 
at least one of the following: assay reagents, buffers, lung cancer-specific nucleic acids or 
antibodies, hybridization probes and/or primers, antisense polynucleotides, ribozymes, RNAi, 
dominant negative lung cancer polypeptides or polynucleotides, small molecule inhibitors of 
lung cancer-associated sequences, etc. A therapeutic product may include sterile saline or 
another pharmaceutically acceptable emulsion and suspension base. 

In addition, the kits may include instructional materials containing instructions (e.g.,
protocols) for the practice of the methods of this invention. While the instructional materials
typically comprise written or printed materials, they are not limited to such. A medium
capable of storing such instructions and communicating them to an end user is contemplated
by this invention. Such media include, but are not limited to, electronic storage media (e.g.,
magnetic discs, tapes, cartridges, chips), optical media (e.g., CD ROM), and the like. Such
media may include addresses to internet sites that provide such instructional materials.

The present invention also provides for kits for screening for modulators of lung 
cancer-associated sequences. Such kits can be prepared from readily available materials and 
reagents. For example, such kits can comprise one or more of the following materials: a lung 
cancer-associated polypeptide or polynucleotide, reaction tubes, and instructions for testing 
lung cancer-associated activity. Optionally, the kit contains biologically active lung cancer 
protein. A wide variety of kits and components can be prepared according to the present 
invention, depending upon the intended user of the kit and the particular needs of the user. 
Diagnosis would typically involve evaluation of a plurality of genes or products. The genes 
typically will be selected based on correlations with important parameters in disease which 
may be identified in historical or outcome data. 
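
By way of illustration only, the selection of genes based on correlation with a disease parameter might be sketched as follows. The sketch assumes that each gene has one expression value per sample and that a single numeric parameter (e.g., an outcome score) is recorded per sample. The function names `pearson` and `select_genes`, the gene labels, and all numeric values are hypothetical and are not taken from the tables herein. Pearson correlation is used only as one possible measure; the text above does not specify a statistic.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant series carries no correlation information
    return sxy / sqrt(sxx * syy)


def select_genes(expression, parameter, top_n):
    """Rank genes by absolute correlation with the parameter; keep top_n."""
    scored = [(gene, pearson(values, parameter))
              for gene, values in expression.items()]
    scored.sort(key=lambda item: abs(item[1]), reverse=True)
    return scored[:top_n]


# Hypothetical data: three genes measured in four samples.
expression = {
    "gene_A": [1.0, 2.0, 3.0, 4.0],
    "gene_B": [4.0, 3.0, 2.0, 1.0],
    "gene_C": [1.0, 3.0, 2.0, 2.5],
}
parameter = [10.0, 20.0, 30.0, 40.0]  # hypothetical outcome score per sample

for gene, r in select_genes(expression, parameter, top_n=2):
    print(gene, round(r, 3))
```

Ranking by absolute value retains genes that are either positively or negatively associated with the parameter, which is in keeping with evaluating a plurality of genes rather than a single marker.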

Example 1: Gene Chip Analysis

Molecular profiles of various normal and cancerous tissues were determined and
analyzed using gene chips. RNA was isolated and gene chip analysis was performed as
described (Glynne, et al. (2000) Nature 403:672-676; Zhao, et al. (2000) Genes Dev. 14:981-993).

Tables 1A and 1B were previously filed on April 18, 2001 in USSN 60/284,770 (18501-001500US) and on November 29, 2001 in USSN 60/334,370 (18501-001520US).










Table 1A

Pkey | ExAccn | UnigeneID | Unigene Title | 7n% phrnn/Qfl% Ml | 70% ROAn/Qf)% Nl
100134 | D13264 | Hs.49 | macrophage scavenger receptor 1 | 1.61 | 0.74
100780 | HG3731-HT4001 | | "Immunoglobulin Heavy Chain Vdjrc Reg | 2.68 | 3.28
100971 | J02874 | Hs.83213 | fatty acid binding protein 4, adipocyte | 1.96 | 0.14
101088 | L05568 | Hs.553 | solute carrier family 6 (neurotransmitte | 0.79 | 0.07
101102 | L07594 | Hs.79059 | transforming growth factor; beta recepto | 2.55 | ■j
101168 | L15388 | Hs.211569 | G protein-coupled receptor kinase 5 | 0.88 | 0.27
101277 | L38486 | Hs.118223 | microfibrillar-associated protein 4 | 0.89 | 0.26
101330 | L43821 | Hs.80261 | enhancer of filamentation 1 (cas-like do | 0.59 | 0.29
101336 | L49169 | Hs.75678 | FBJ murine osteosarcoma viral oncogene h | t15 | 0.41
101345 | L76380 | Hs.152175 | calcitonin receptor-like | 0.81 | 0.31
101678 | M62505 | Hs.2161 | complement component 5 receptor 1 (C5a l | 1.31 | 0.77
101764 | M80563 | Hs.81256 | S100 calcium-binding protein A4 (calcium | 1.44 | 0.82
101771 | M81750 | Hs.153837 | myeloid cell nuclear differentiation ant | 0.96 | 0.45
101842 | M93221 | Hs.75182 | mannose receptor; C type 1 | 1.27 | 0.37
102283 | U31384 | Hs.83381 | guanine nucleotide binding protein 11 | 1.04 | 0.3
102363 | U39447 | Hs.198241 | amine oxidase, copper containing 3 (vasc | u.ao | n 9R
102507 | U52154 | Hs.193044 | potassium inwardly-rectifying channel, s | £.,Q I | v>.HO
102698 | U75272 | Hs.1867 | progastricsin (pepsinogen C) | 0.95 | 0.23
103025 | X54131 | Hs.123641 | protein tyrosine phosphatase, receptor t | 1.62 | 0.21
103280 | X79981 | Hs.76206 | cadherin 5; VE-cadherin (vascular epithe | 0.9 | n 41
103496 | Y09267 | Hs.132821 | flavin containing monooxygenase 2 | 1 27 | n 4Q
103541 | Z11697 | Hs.79197 | CD83 antigen (activated B lymphocytes, i | 1.86 |
103554 | Z18951 | Hs.74034 | caveolin 1, caveolae protein, 22kD | 1.27 | 0.47
104212 | AB002298 | Hs.173035 | KIAAmnn protein | 1.17 | 0.16
104691 | AA011176 | Hs.37744 | ESTs | 1.08 | 0.35
104825 | AA035613 | Hs.141883 | ESTs | 0.75 | 0.27
104857 | AA043219 | Hs.19058 | ESTs | 2.6 | 3.3
104865 | AA045136 | Hs.22575 | ESTs | 1 93 | n 4Q
104989 | AA102098 | Hs.118615 | ESTs | w.uo | 0.32
105729 | AA292694 | Hs.3807 | ESTs; Weakly similar to PHOSPHOLEMMAN PR | 0.86 | 0.34
105847 | AA398606 | Hs.32241 | ESTs | 1.32 | 0.4
105894 | AA400979 | Hs.25691 | calcitonin receptor-like receptor activi | 0.78 | 0.28
106490 | AA451861 | Hs.115537 | ESTs; Weakly similar to dipeptidase prec | 1 9 | 0.47
106536 | AA453997 | Hs.23804 | ESTs | 0.82 | 0.15
106605 | AA457718 | Hs.21103 | Homo sapiens mRNA; cDNA nKF7n^fi4Rfl7fi (fr | n qq | n 07
106667 | AA461086 | Hs.16578 | ESTs | 1 17 | 0.4
106773 | AA478109 | Hs.188833 | ESTs | 1.46 | 0.43
106797 | AA478962 | Hs.169943 | ESTs | 1 1R | 0.32
106844 | AA485055 | Hs.158213 | sperm associated antigen 6 | 0.98 | 0.51
106870 | AA487576 | Hs.26530 | serum deprivation response (phosphatidyl | 1.05 | 0.14
106954 | AA496980 | Hs.204038 | ESTs | 1.25 | 0.33
107054 | AA600150 | Hs.14366 | ESTs | 1.11 | 0.4
107292 | T30407 | Hs.4789 | ESTs; Weakly similar to UAIUdll Vc-MI coo | 1.07 | 2.58
107994 | AA036811 | Hs.165030 | ESTs | n 7 | 0.21
107997 | AA037388 | Hs.82223 | Human DNA sequence from clone 14.1 nn p | 1.02 | 0.48
108041 | AA041552 | Hs.61957 | ESTs | 1.44 | 0.51
108087 | AA045709 | Hs.40545 | ESTs | 1.98 |
108382 | AA074885 | Hs.67726 | macrophage receptor with collagenous str | |
108435 | AA078787 | Hs.194101 | ESTs | 2.53 | 1.53
108480 | AA081093 | Hs.68055 | ESTs | 1.56 | 0.48
109252 | AA194830 | Hs.85944 | ESTs | 9 fiQ | 3.18
109550 | F01534 | Hs.26981 | ESTs | 1 1Q | 0.65
109613 | F03031 | Hs.27519 | ESTs | 1 m | 0.29
109837 | H00656 | Hs.29792 | ESTs | 0.81 | 0.15
109893 | H04768 | Hs.30484 | ESTs | 1.44 | 0.32
109984 | H09594 | Hs.10299 | ESTs | 0.62 | 0.14
110099 | H16568 | Hs.23748 | ESTs | 1 m | 0.28
110837 | N30796 | Hs.17424 | ESTs; Weakly similar to semaphorin F [H. | 1 1 | 0.22
111247 | m9825 | Hs.16762 | Homo sapiens mRNA; cDNA nKF7n^R4R9nfi9 (f | 1.26 | 0.26
111341 | N80935 | Hs.22483 | ESTs | 1.57 | 0.52
111510 | R07856 | Hs.16355 | ESTs | 0.30 | \
111737 | R25410 | Hs.9218 | ESTs | 0.97 | 0.24
113195 | T57112 | | yc^iuy i i.s i Stratagene lung (#937210) | 1 99 |
113238 | T62979 | Hs.189813 | ESTs | 9 97 | ft 4R
113540 | T90496 | Hs.16757 | ESTs | 1 OR | 0 99
113552 | T90889 | Hs.16026 | ESTs | 1 1R | n 49
113606 | T93093 | Hs.17125 | ESTs | 1.48 | 0.7
113695 | T96965 | Hs.17948 | ESTs | 1.0*1 | 0.28
113946 | W84753 | Hs.37896 | ESTs | 1.79 | 0.72
114251 | Z39898 | Hs.21948 | ESTs | 195 | 0.25
114359 | Z41589 | Hs.153483 | ESTs; Moderately similar to H1 chloride | 1.42 | 0.13
115230 | AA278300 | Hs.182980 | ESTs | 2.62 | 0.42
115279 | AA279760 | Hs.63671 | ESTs | 1.79 | 0.91
115566 | AA398083 | Hs.43977 | ESTs | 0.86 | 0.2
115965 | AA446661 | Hs.173233 | ESTs | 0.79 | 0.04
116166 | AA461556 | Hs.202949 | KIAA1102 protein | 2.29 | 0.68
116279 | AA486073 | Hs.57362 | ESTs | 2.27 | 0.78
117023 | H88157 | Hs.41105 | ESTs | 1.36 | 0.16






WO 02/086443 



PCT/US02/12476 






117209 


H99959 


riS.4z/bo 


a a ann-f 


N9UY ry 


nS.y444b 


4 A QOCH 

11 0301 


N9383y 


Hs.os/oo 


nyu/3 


D'JOQQ/l 

K3zBy4 


HS.4bbl4 


nyzzi 


K981U0 




A A riOO A 

\ i 98z4 


l»f-7il roe 

VW4b3b 


|J_ A QA 

HS.184 


•« A QQC1 

liyobl 


\MQ(\-tA c 

W8Uno 




lzuu41 


vv9z7 to 


ns.byjoo 


1 Oi"H QO 


Z38839 


Hs.izbuiy 


1 20467 


AAzblb/y 


HS.18fbZ8 


1Z1314 


AA4UZ/yy 


nS.lbzboo 


121643 


A A /M 70.70 

AA41 /U/8 


MS.193/b/ 


A ia cr\r\ 

121690 


AA418074 


li_ a a nooc 

Hs.nuz8b 


122633 


AA4D4U8U 


HS.34Bb3 


A OOG7Q 

123978 


C20653 


MS. 170^78 


124214 


H58608 


HS.151323 


124357 


N22401 




124438 


K\A ^~\^ QQ 

N40188 


HS.lUzb5U 


1 251 67 


W4bbbU 


HS.lU^b41 


Izbl /4 


\AfC1 QOR 

Wbl83b 


HS.^31U8Z 


1 25422 


a aoaqooo 
AA9U3zz9 


HS.l oof 1 / 


125561 


Al/M 7CC7 

AI41 f oof 


HS.Zjiyf 8 


125831 


U0U988 




1 97AAO 
IzvUUz 


DOCOQA 


Up 0/1 Q7Q 

MS.z4yf y 


1970A7 

lz/3U/ 


A A0RQ0R7 

AAjoyoo/ 


MS. 1 ZO/ IZ 


Iz/bUy 


A AC90RRQ 

AAbzzboy 


U« 4 CAH D 

HS.lbUolo 


127959 


A1QAO/71 

AI3UZ4/1 


nS.lz4zyz 


128458 


D52193 


HS.bb34U 


4 r iQG f )A 

12ob^4 


A A/l7Q9ftQ 

AA4/yzUy 


Up 4MfiA7 


19Q7QO 

l/o/ 89 


A AV1QCRQ7 

AA48b0b/ 


Up 1ARRQE 

ns.iubbyb 


128798 


ACA1 XQCQ 

Ar014yo8 


Up •iacoiq 

ns.lUbydb 


A OQOCO 


R51076 


Up 4A79fff 

nS. lu/3bi 


129057 


X624ob 


Up 04 A7AO 

HS.zl4/4z 


1 2921 0 


A A Af\* RRA 

AA4ulbo4 


Up OAOQ/O 


10QO/4A 

lz9z4U 


u/ovioftA 
Wz4Jbu 


Up 00.7QGQ 


129402 


T63781 




129565 


V77777 
Kfttff 


Up 1QQ70C 

Hs.iyo/^b 


•I OOCQO 

1Z9090 


A A/4Q7A1 R 
AA40/U10 


Up 0004/1 

nS.yool4 


19QR0R 

izybzo 


A A/1/17/1 A 
AA44f 4 I U 


Uc 1 171 0 
MS. I l / I Z 


•1 90RQQ 

iz9oyy 


A A/1CQR7Q 

AA4bBb/B 


Up 10A17 

liS.lzUl f 


129898 


N48oyo 


Up 1 lOCZG 

HS.13zbb 


129958 


LzUbyi 


Up 4 07Q 
MS.13/0 


iono70 
13UZY3 


Uby914 


Up 4KQQCQ 

nS.lboobo 


130655 


MnOOQ/1 

N92934 


Up 4~fAr\Q 

HS.l /4uy 


130657 


T94452 


Up OA4 cni 

Hs. 201 591 


131061 


N64328 


HS.22567 


131066 


F09006 


Up OOCDQ 

Hs.22588 


131263 


R38334 


Hs. 24950 


131589 


U52100 


Up ICM ClA 

Hs.29191 


4 O 4COC 

131b8b 


A A A C7/t <")0 

AAlb/4^8 


Up 0ACD7 

HS.3Ub8/ 


131751 


LH QQOR 

HI 833b 


Up 01 CCO 

HS.Jlbbz 


132430 


1 zoboU 


Up 

HS.zb8b/b 


100/17G 

13z4/b 


MG71Q9 

Nb/iyz 


Up AOATR 

HS.4y4/b 


1Q9Q0R 

13zBob 


irnQRR7 
rliybbf 


Up C7QOQ 

ns.b/yzy 


Idol^U 


Ab4bby 


Up RKAOA 

HS.bb4z4 


IQQyIQQ 


n/iR07fi 
U4boYU 


Up 7 A 1 0A 
HS.f 41 ZU 


133565 


H57056 


Up OA/1Q01 

HS.ZU4831 


133651 


uy/iuo 


Up 1700Q1 


133835 


A A nnO/1 DQ 

AAUby489 


Up 7RRA(\ 

HS./bb4U 


1339/8 


W/38by 


Up 7QAC4 

nS./8Ubl 


1 OODQC 

ijjyoo 


L34657 


Up 7Q4AR 

nS.folAo 


134299 


AA48/bbB 


Hs.8135 


i O/Iqaa 
1343UU 


U8iy84 


Up 1CCAQ0 

HS.lbbUbz 


134323 


A A AOQQ7C 

AA02897O 


Hs.8175 


134343 


UbUb83 


Up DOAOO 

HS.BzUzB 


134417 


D879b9 


Up QOQH 

HS.82921 


134561 


i ugao*. 
U/o4^7 


Up DCOAO 

nS.obJuz 


134624 


VA/C71 i17 

vvb/14/ 


Up Q70A 


134090 


Hoo3b4 


Up QQR1 


134749 


Liuybb 


Up QQAQCZ 

HS.oy4ob 


10>l7Qft 

134/80 


LUbi oy 


Up QOft/in 

HS,Byb4U 


10/QRQ 

1340 oy 


1 ODZOQ 


Up QA/101 
HS.yU4zl 


135346 


IvlzlUbb 


Up oao 

Hs.yyz 


lAni 10 
1UU113 


D00591 


Hs.84746 


1UU14/ 


D13666 


Hs.136348 


4 AHOQA 

lUUZBU 


D42085 


Hs.155314 


100335 


D63391 


Hs.6793 


10036O 


D78335 


Hs.75939 


1AH079 

iuuo/ z 


D79997 


Hs.1 84339 


100486 


HG1112-HT1112 


100559 


HG2197-HT2267 


100576 


HG2290-HT2386 


100668 


HG2981-HT3938 


100906 


HG4716-HT5158 


100930 


HG721-HT4827 





ESTs 
ESTs 
ESTs 

v-ets avian erythroblastosis virus E26 o 
"""ytfOglLsl Soares fetal liver spleen 
advanced glycosylation end product-speci 
ESTs; Moderately similar to !!!! ALU SUB 
ESTs 

ESTs; Highly similar to KIAA0886 protein 

ESTs 

ESTs 

ESTs 

ESTs 

inhibitor of DNA binding 4; dominant neg 

ESTs 

ESTs 

" B,, yw37g07.s1 Morton Fetal Cochlea Homo 

ESTs 

ESTs 

EST 

ESTs 

ESTs 

"'"•HUM145B09B Clontech human fetal brain 
ESTs 

ESTs; Weakly similar to plL2 hypothetica 

ESTs 

ESTs 

ESTs 

ESTs 

ESTs 

chemokine (C-C motif) receptor-like 2

ESTs; Highly similar to Rap2 interacting 

CDW52 antigen (CAMPATH-1 antigen) 

KIAA1102 protein 

interleukin 7 receptor

" n "yc21g01.s1 Stratagene lung (#937210) 

vasoactive intestinal peptide receptor 1 

Homo sapiens mRNA; cDNA DKFZp586L0120 (f 

ESTs; Weakly similar to !!!! ALU SUBFAMI 

KIAA0439 protein; homolog of yeast ubiqu 

ESTs 

annexin A3 

MAD (mothers against decapentaplegic; Dr
cysteine-rich protein 1 (intestinal) 
ESTs 

ESTs; Moderately similar to HYPOTHETICAL 
ESTs 

regulator of G-protein signalling 5 
epithelial membrane protein 2 
Grb-2-associated binder 2 
ESTs 
EST 

Homo sapiens clone TUA8 Cri-du-chat regi 
slit (Drosophila) homolog 3 
tetranectin (plasminogen-binding protein 
adipose specific 2 
ESTs 

dihydropyrimidinase-like 2 

ESTs; Highly similar to RGC-32 [R.norveg 

transcription factor 21 

platelet/endothelial cell adhesion molec 

ESTs 

endothelial PAS domain protein 1

Homo sapiens mRNA; cDNA DKFZp564M0763 (f 

transforming growth factor; beta recepto 

solute carrier family 35 (CMP-sialic aci

adenosine deaminase; RNA-specific; B1 (h 

deleted in liver cancer 1 

ESTs 

carbonic anhydrase IV 
TEK tyrosine kinase; endothelial (venous 
ESTs; Moderately similar to !!!! ALU SUB 
phospholipase A2; group IB (pancreas)
Chromosome condensation 1 
Homo sapiens mRNA for osteoblast specifi 
KIAA0095 gene product 
platelet-activating factor acetylhydrola 
Uridine monophosphate kinase 
KIAA0175 gene product 
TIGR: ras-like protein TC4 
"collagen, type VII, alpha 1" 
"calcitonin/alpha-CGRP, alt. transcript 
"TIGR: CD44 (epican, alt. transcript 12 
Guanosine 5-Monophosphate Synthase 
"TIGR: placental protein 14, endometrial 



1 AR 

1.4b 


n 4r 

U.4B 


1 C1 


4 

1 


1 0/ 

1.34 


n aq 

U.4B 


1 4A 
1.14 


A 07 
U.Z7 


4 09 
l.OZ 


n RO 
U.OO 


A 

1 


n iq 
u. iy 


1 QO 
l.OO 


n ar 


I.Zo 


n rr 
u.oo 


A Q1 

u.yi 


v.Of 


1 R7 

l.or 


1 Q1 

i.y i 


1 0 
1.0 


n oi 

U.Ol 


0 01 
Z.OI 


n rr 
u.oo 


1 A7 


A R1 

u.bi 


1 01 

l.ol 


A RO 

u.bo 


1 RO 

l.bz 


n 09 
u.oz 


ft QO 


A OR 

u.ob 


1 OQ 

i.zy 


1 


1 Oft 

l.ob 


n 7 


1 A(\ 
1. 40 


n rq 
u.oy 


0 A7 

o.vf 


0 7R 
0.(0 


1 04 
1.04 


n o 
u.o 


1 QQ 

i.ay 


A RO 

u.bo 


A QA 

u.y4 


fl OR 
U.00 


o no 

o.UZ 


A OR 
H.UO 


I.U 1 


n rq 
u.oy 


1 91 
I.Z 1 


n 09 

u.oz 


0 R 

z.o 


1 
I 


1 10 
1. 10 


n oo 
u.oo 


1 AR 


n rr 
u.oo 


1 1 
1. 1 


n o4 

U.04 


1 1fi 
1. 10 


n rr 
u.oo 


0 A/J 
Z.U4 


9 A 
z.4 


1 77 
l.f f 


A 70 

U./o 


4 11 

1,11 


A OR 

u.ob 


A Q1 

u.yi 


A Ai 
U.4I 


1 OR 
l.OO 


n 40 


A R7 

u.b/ 


A AR 

u.uo 


1 0 
1.0 


A AO 

U.HZ 


1 9R 
I.ZO 


n ar 


1 *;a 

l.OO 


1 
1 


I. to 


n ro 
u.oo 


A R1 
U.B1 


A 01 
U.O I 


A 


A 09 

u.zz 


1 AA 
1.44 


A 7R 

u./b 


A Oft 

u.yb 


A AD 
U.4z 


1 R4 

l.bl 


A AR 
U.4b 


A Q7 

u.yr 


A 07 
U.O/ 


O OJI 

z.34 


O QO 

Z.Bz 


1 0 

l.z 


A RO 

U.bz 


A Oft 

u.yb 


A OQ 
U.OB 


1 A7 


A RO 

u.oz 


1 RR 

l.Bb 


0 AQ 

z.uy 


1 70 
l.f 0 


A RR 
U.OO 


A Q1 

u.y i 


A OQ 

u.zy 


A RO 
U.DZ 


n 9 
u.z 


1 OQ 
I.Z? 


n 4R 

U.nO 


0 OK 
Z.ZO 


A R7 
U.Of 


1 RR. 
l.OO 


A R9 
U/DZ 


1 1R 
1. 10 


fl 04. 
U.Of 


A 7Q 

u. / y 


n 97 

U.Zr 


A QQ 

u.yy 


n 9R 
u.zo 


1 no 

I.UZ 


fl 4R 
U.HO 


A RR 

u.oo 


A AO 
U.4Z 


1 1Q 

i.iy 


A 07 
U.Z/ 


1 01 

l.zl 


A R7 
U.O/ 


1 OR 
l.ZB 


A 
1 


0 10 
Z. (z 


A RR 
U.OO 


0 OR 
Z.00 


9 74 
Z. f*t 


1 OR 
l.OO 


n oo 
u.oo 


f) RQ 

u.oy 


f) 0 

u.z 


u.**o 


A 01 
U.Z 1 


9 14. 
Z. IH 


9 R4 
Z.OH 


n ro 
u.oo 


fi 10 
U. 10 


1 
I 


9 1R 
Z. IO 


u.b 


9 
Z 


1 09 
l.UZ 


1 OQ 

i .oy 


4 

1 


R RR 
0.00 


A Q1 

u.yi 


0 A>l 
Z\U4 


n 7R 

U. f O 


9 

z.uo 


1.09 


1.93 


0.97 


3.6 


1 


1 


0.85 


1.9 


1.18 


2.29 


1 


1.45 






mnaRO 
luuaou 


10019/I 
JUU1Z4 


u c 11779Q 


L-orofi n A A /or\!rtarmr\tuc!o Hultnco cimnlQ 

Kcidiin if ^cpiuctuiuiysio uuuusd oiiii^jit; 


0.84 


2.6 


1010*31 


ioco70 
JUoU/U 


He 1R17^R 
MS. IO 1 1 00 


[viBinx mciaiioproiuindoc y i^gmduiidac 


0.77 


1.52 


101111 


L08424 


Up 1R1Q 


AchsGts-scuts comptsx (Drosophila) homol 


1 


1 


1U11Z4 


I in*j/i*3 


MS. I IZO^l 


w Dmlonco mfiiKi^r *3 elfin WorinoH /Q{^A 

rroicdse inniuiiur o, sKHi-aerivcu 


0.62 


2.67 


ioi 1TE 
lull /o 


1 4 QQOfl 

LlooZU 


MS.ooyou 


"Mslsnoma sntiysn, fsmily A, 2" 


*j 


1 


4f\4')r\A 

1U1ZU4 


Lz4ZUd 


Up Q99*37 


Atsxia-tGlanyisctssis group D- associated 


0.74 


4.1 


101/1*31 
I U 1 40 1 


M1QRRR 

M lyooo 


MS. lU/O 


OITIdll piUJIIlc-l lUll piUlclil ID ^LtUlllMIIIJ 


0.85 


2.51 


IU 1440 


M91*3RQ 

mz i joy 


MS. lyooou 


L>t5roHn R /aniriormnl\/eic Hullnc2 cimnloY 
KcfdUIl O ^epiUci II lUiyolo UUIIUSd olllipiCA 


0.61 


8.83 


mi*;ii 

IU 10 1 I 


M97R9R 
MZ/0ZO 


Wc 9fi7^1Q 
MS.zor o iy 


Cn^nnonfine rotrrtuirul nrnfoaco 
CllUuyUllUUb fcLIUVIIdl piUlcdbt; 


1.03 


1.13 


101E9R 
IU IOZO 


M<:yo*+U 


MS.z/uo/y 


r'arrinnprnhrunnir antinPn.rplatpH rpll flri 
vjdHjil luui i iui yui iil. ai tiiyci i-i cioicu ucn au 


1.07 


4.61 


1 U 1 040 


M*31*39R 
Mo IOZO 


He 71 RAO 
MS./ lO*fZ 


"f^nanino m tr-loritirlp WnHlnn nrrdpin ICX n 


0.97 


1.13 


101R9R 
IU IOZO 


MR79Q*3 
MO/ zyo 




"Human nara^hv/rnirt hnrmnnp.rplafprl npnti 
nun laii pdidiiiyiuiu iiuiiinjiic iciaLcu pepu 


\ 


1 


101R4Q 

iu lu^y 


mrooa7 


He IROn 
ns. i oyu 


Honarin-hinHtnn nrnuirlh faotnr hinHinn nr 
ilcpdllil UlllUlliy yiuwui lauiui uuiuniy yi 




2.7 


10179A 

i u i / z^ 




He R90, 


hiillnne npmnhinniri anfinpn 1 f?^fl/?40kD^ 


1 


8.98 


10174R 
i u i / *«j 


M7R4R9 
M/O'HOZ 


He 1 Q9R 
ms. i y/o 


npemnntpln *}. /npmnhinile unlnarie antifipn 
Uctii i luyicn i o ^ctiipinyuo vuiyaiio anuycii 


1 


2.78 


iui /oy 


MoUZ44 


He *IR4Rni 
MS.1O40U1 


ooiuic cdrneriaiiiiiy / ^diiuiiiu diiiinu 


1.07 


2.45 


101 q(\a 
1U10U4 


Mobbyy 


He IRQfl/ln 

ms. i oyo^u 


T VIS nrnloin l/inoco 

i 1 1\ proicin KindoB 




1 


lUlOUO 


Moo/0/ 


He 1 1 94HR 

ns. J J z*f uo 


Q10H r»falpinm_Kinrlinn nrrifoin A7 /nenriac 


0.74 


1.76 


1 f\4 QOQ 

luiouy 


Mbbo4y 




MOmO SapictlS COIIIIcaIII ZO \\JJO£.) lllr\INr\ t U 




7 


1U1O40 


Myo4Zo 


He 7RRR7 
MS./ OOD/ 


r roicin lyrosine piiuopiididoy, icucpiui- 


1 


1 


lUlool 


mcmoeo 
My4ZbU 


MS.0ZU40 


K)!THl/tna /not iri nr/Mi/lh-i^rrtmAlinn fa^lor 

iviiOKine ^ricunie gr owin-promuuiiy idt»iur 


1.13 


2.6 




1 H 0*39*3 

UlUozo 


Up 7R1 17 
MS./ O I I / 


irneneuKin ennanccr uniainy iduiui l x 


1.03 


1.61 


102154 


U 17760 


Up 7KR1 7 

ns. fOO if 


n i nminin koto *3 tn'ma'in /19Rl/D\ Vol tn i 

Laiuinin, oeia o (nicein ( izokuj, ndttiu 


0.94 


3.62 


10010*3 

luziyo 


1 1007GQ 

UzU/Oo 


He *31*3 
MS.O lo 


sscreisu pnospnupiuiciii i ^ubicupuiiuii, 


0.34 


4.59 


lUZoUo 




Up onri7Q 
Ms.yuu/ o 


chromosome segregation 1 (ysast homoloy) 


1.45 


2.97 


1uzo4o 


U37519 


Up D7C3Q 

nS.o/boy 


Aldehyde dehydrogenase 8 


0.52 


2.25 


ioo*;ri 


ub i i*fO 


He 779R.fi 
M5. / / ZOO 


Pnhancpr nf 7Pelp /nrn^nnhila^ hnmolnn 9 

ClilldllUCI Ul tcolc \U\ Uoupi \\\a) liwinuiuy c 


0.91 


2.46 


lUZOlU 


UbbUH 


He *Jfl743 
MS.OU/*+0 


r reier eiuidiiy expicobcu diuiycu 111 iiicta 


*j 


3.88 


io*5R9*3 
lUZOZO 


UbbUoo 


He ^711fl 


Mcldiiuilid dliuycii, i di i my rt, o ^ivi/A\3t:-3j 


1 


1* 


luzooy 


1 17-1 or»7 
U/ 1ZU/ 


Up 9Q97Q 

Ms.zyz/y 


eyes dDbem i^urobupniid/ 1 luiiiuiuy c 


•j 


1 


1O0GQR 

luzoyo 


1 l7v1R'1 9 

U/4blZ 


He 9*50 

Ms.zoy 


rorKiiedu dox ivi i 


1.06 


2.77 




1 ICHR1 Q 

uyibio 


He RDQR9 

MS.ouyoz 


[xeuiuicnsin 


*| 


1* 


lUzooo 


X04741 


Up 7C11ft 


Ubiquitin carboxyl-terminal esterase L1 


1.13 


2.59 


1O0Q1 *3 


AU/byb 


He Rfl^49 
MS.0U04Z 


keratin 15 


0.7 


472 


1 moi c 

luzyio 


aU/ozU 


Up 99KQ 

nS.ZZOo 


Jviainx (vieiaiioproieiiidbe iu ^oiruinuiyoin 


1.15 


3.35 


lr\OQR*3 

luzyoo 


Aiby4o 


He *370Rft 
MS.O/UOo 


n /^olr>tfnnln/r>olr»ifnn!n rplalorl nnl\/nont! 

uaiciionin/caii/iioniii-icidieu puiypupu 


1 


1 


lOdUZl 


X53587 


Up QR9RR 
MS.oOZOb 


"Integnn, beta 4" 


1.38 


2.34 


inoo*3C 
lUouoD 


Ao4yzo 


Up G91RQ 
nS.OO I oy 


KA^friv nr^fitallrnirrtf^aco 1 /in^orctltiQl P 

iviaLnx meiaiiopruLcdbu i uiiLciSiiiiai t> 


*] 


14.93 


io*30rr 


AO/ OHO 


He 1RA<i1ft 

Mo. 1 OtO 1 U 


Oil dUIIII 


1.25 


4.17 


lUJUbU 


X57766 


Up 1 RK99/I 

HS.lbboz4 


mainx meianoproieinabe 1 1 ^siroineiyoin 


-j 


1.72 


10*J1 1 0 


Abdbzy 


Ue 9R77 
MS.ZOf / 


OdUDcllll 3, r-UdUllolJll ^pidbcllldi; 


1.16 


7.38 


in*3ioR 




Up 77*5R7 
MS./ /JO/ 


monoKine inuuceu uy yanuua inierjerun 


0J1 


1.48 


10*39/9 
1UJZ4Z 


Afbd4Z 


He *3RQ 

MS.ooy 


"Alfnhnl Hahurlrnnonaco 7 /place l\0 mil 

MiiiOiioi uenyuruymidbc / ^udbb iv^, niu 


1 


\ 


IUjo 1 Z 


ADilbyo 


He *31RR 
MS.O I OO 


LyriipnuGyic diuiycu u ouinpicA, iuuuo u, 


0.92 


1.28 


io*3./l7Q 

IU04/O 




He 1RQQ1 

MS.ooyy j 


Q100 f»aif>inm hinrlinn nmlpin A9 
O IUU walulUJTi -UJJUJJIiy piULclll r\£. 


1.05 


5.81 


io*3ccq 
1UO00O 


z.iyo/4 


He 97RR 
MS.Z/OO 


l/ornlin 17 

Keraun i / 


0.65 


6.68 


1 o*3i;7R 
IUOO/0 


79RQ17 


He 9R^1 
MS. ZOO I 


riaemnnloin 9 

uebiuuyiciii £. 


0,79 


173 


103587 


Z29083 , 


II. Q019Q 

ms.oZIZo 


b 1 4 uncoieiai anugen 


i 
i 


3.93 


103594 


Z31560 


Up QIC 

hs.olb 


N ODV / privy rlA^csrminlnn paninn hnv 9 n 

oKT vsex aeiermining region tj-dox z, p 


0.71 


7.23 


lUoYOO 






co i s, Migniy similar to intcyrdi mssnura 


0.99 


1.8 


104158 


A A AfZAQf\Q 

AA4o4yUo 


Up 0197 

MS.olZ/ 


r\f aaui 44 gene proauci 


0.96 


1.29 


104558 


Kbbb f o 


Up SSQRQ 

Ms.ooyoy 


Unm^n PiKI A ponnon^Q frn m ^fnno Qf37KI91 rtn 

Muman uin/\ sequence rrom cione yo/ inz i on 


1.23 


7,23 


lU4ooy 


AAUlUbbb 




to I S 


0.96 


2*11 


104733 


A AfH QjI QD 

AAUiy4yo 


Up 99A71 

MS.ZoU/1 


to IS 


1.18 


1.88 


in/foots 

iU4yuo 


AAUobouy 


He 9RR09 
MS.ZOOUZ 


rroiein Kinase aomaiits oonidiiiiny piuici 


1.11 


3.15 


10/1Q7Q 


AAUoo4bo 


He 10^99 

ms. lyozz 


PQTc* Wpnl/lu eimilnr tn till AI 1 1 Ql IRFAMI 
to 1 b, VVcdKly Simildl IU Wll MLU OUDrMIVII 


1.64 


2.89 


105012 


AA1 1 bUob 


He 0*390 

MS.yozy 


Momo sapiens mrviNM ior usooo, uunipiciB 


1.19 


3^91 


10K17K 
TUOl #0 


AAl0b0U4 


Ue 9R7 A(\ 
MS.ZO/**U 


to i s, wedwy simiidi iu unknuwn [cueiev 


0.9 


4.63 


incocQ 

lUOZDO 


A A107Q9R 


Up RRR9 
MS.000Z 


PQTc 
to 15 


0.95 


2.87 


10K9QR 

luozyo 


A A9QQ/ieQ 


U c 9RQRQ 

Ms.zoooy 


PQTe 

CO 1 b 


*] 


1,13 


in,E*31 0 
lUOOlZ 




Up 9*3*3/10 
MS.Z0040 


o-pnase Kinase-assouidieo proiciii c 


1.32 


3.01 


mG71Q 


A A9Q1 RyM 

AA^yib44 


Ue *3R7Q*3 

MS.oo/yo 


Mypoineucdi protein tlozoioo 


1.28 


2.31 


IUO/ HO 




u Q qcqa 
ms. yoyo 


ESTs 


1 


1 


ioroi o 


A A>H 1R91 
/V\4I 10/ J 


He RRQR 
MS.ooyo 


PQTe* esmp ae RFHR9 
CO I bdmc do DrnO f 


0.94 


2.04 


infi9*31 

lUoZol 


A Ay19QK7't 

AA4/yon 


Ue *3nnn9 

M5.O0UUZ 


r\i aa iooo proiein 


1.04 


1.5 


1UO04U 


AA40*JDU/ 


He *3R11il 
MS. 00 I l*l 


Hvnnthpfiral nrnfpin Fl 111100 

nypoineucai proicin tlj i i juu 


1.26 


2.26 


106575 


AA40buoy 


Ue 10R/191 
MS. IUO*fZ I 


CO I S 


*| 


2 


10Rfi*39 


A A/1KOQQ7 

AA4oyoy/ 


Uc 1 1 QRO 

ms. 1 1 you 


f~2DI "snphnrorl mQfac{ao!c_accor"iafpH nrnfp 

•ori-ancnoreu meidbidbib-dbbUL.idLcu ijiulo 


0,87 


1.32 


106727 


AA4bOJ4/ 


Ue "ZAriAZ 
nS.04U4o 


U«/nrkfha(inol nrnfain PI I907RA 

Mypoinencai proiein rLd^u/ot 


0.87 


1.59 


iorqor 


A A/On997 

A>\4yuzo/ 


Ue 99909/1 
MS. ZZZUZ4 


Trraneprintinn fanfnr RMAI 9 /p\/p|p-iikp f 

1 ransenpuon laciur diviml^ ^i/yuio-iii\c i 


0.61 


16 


4 070KQ 

nu/uoy 


AA0U0040 


Up OWAA 
MS.ZOU^H 


DAHR1 /Q pprouSeiaP^ hnmnlnn mil Rp 
t\r\uo i ^o. cerevibidcj iiuniuiuy ^c- uuji r\© 


0.48 


2,67 


10710/1 

IU/ 1U4 


A ARHQ7RR 
AAOUy/OO 


Ue 1 ROA'i 
MS. I0ZHO 


Mi ipIooI ap nrrtfoin 1 H90fern 

iMUCiouiai proiein i ^i^ukuj 


1.01 


1.44 


1071 G1 


A AR911RQ 
AADZ 1 1 oy 


He RRR7 
MS.OOO/ 


FQTe* nrnnnllanpn l_M nrntpina^P 
CO 1 5, prOUOIIdycIl l-N piUlcllldoc 


0.97 


2.89 


1079Q/I 


o/4L/jy 


He 901004 

ns.zy iyu*f 


Appoccrtn/ nrofoinc RAP*317RAP9Q 


1.15 


3.65 


4 mom 


AAfl9R^'IO 
AAUZb41 O 


Up Q1 K*3Q 

ns.yiooy 


CO I s 


0 79 
u. t c 


3.44 


107099 

lu/yzz 


MAU/OUZO 


Up R*Mfin 

MS.O IhOU 


In ciinprfamlli/ rprootnr I MIR nrpnircnr 
ig supci idiniiy icucpiui limiia picv>uioui 


1 


2.48 


iu/ yoz 


AAfl9Q^17 


He 1 RR7R 
MS. 1 00/ 0 


HunnihpHrat nrntain Fl 191 690 
nypuiuciiudi piuioin rwtiotu 


1 


1 


iorrq^ 
tuooyo 


MM I L \0 \D 


He 7nR9*3 
MS. / UO£0 


KIAA1077 nrnfpin 

IMnnlUi / piUlclll 


0.91 


3.53 


1 UOO J / 




He. fi?1ftn 


ESTs 


1 


1 


108860 


AA133334 


Hs.1 29911 


ESTs 


0.73 


7.3 


108990 


M1 52296 


Hs.72045 


ESTs 


1 


1 


109166 


M179845 


Hs.73625 


"RAB6 interacting, kinesin-like (rabkine 


1 


4.55 


109424 


AA227919 


Hs.85962 


Hyaluronan synthase 3 


1 


1.28 


109665 


F05012 


Hs.27027 


Hypothetical protein DKFZp762H1311 


1.42 


2 


109970 


H09281 


Hs.13234 


ESTs 


1.13 


2.16 






110015 


M1DQQR 
n ivjyyo 


Hs.7164 


A rii^infpnrin snri mptallnnrntpina^p Hnma 


0.84 


1.95 


11015R 

1 IU 1 00 


H1RQ57 
n i oyo/ 


H«5 491^ 


ESTs 


0.94 


1.41 


1105R1 
1 1 U00 1 


H5QR17 

noyo i i 


Mc 51 QQ 

nb.O 1 33 


HSPH150 nrnfpin elmilar in iihinnitin-rnn 


0.91 


3.18 


11199*3 

1 1 IZZO 


MRRQ91 
Nooyz i 


He 34RQR 


ESTs i Wsskly simiisr to nsoQGnin [H.sspi 


0.91 


3.13 


111345 


NRQR90 
noyozu 


Hq 1455Q 


Hypothetical protein FLJ 10540 


1 


1.25 


1 1 1R7R 
I I lOf 0 


Koozoy 


He 9Q^94fi 
n&.zyozHo 


"FRTe Wpflklv/ eimrlartn nntstiv/p nlfiO \ 
cui o, vvccifxiy annual iu puiauvc piou[ 


0.83 


1.27 


111902 


R3Q1Q1 
r\oy i y i 


Uq 1 0Q445 


KIAA1020 protein


0.91 


0.91 


1 1 9944 


R5110Q 

r\o iouy 


U Q 70R93 
nb,/ uozo 


KIAA1077 protein


0.77 


3.01 


112973


T17271 




"cDNA FLJ 13308 fis, clone OVARC1001436, 


1 


1 


11 9QRQ 


T934R9 

1 Z040Z 


He RQQR1 


"niarvlnlvrpml kinaep 7Pta M 04kDV 


0.55 


1.03 


113047


T25867 


He 754Q 


ESTs 


0.87 


2 


11Q0Q5 
i iouyo 


T40920 


He. 1 9R733 

Flo. 1 


ESTs 


1 


1 


113531


T90345 


Hs.16740 


Hypothetical protein FLJ11036


0.42 


1.44 


113970 


WRR74R 

WOO/ HO 


Hs.8109


ESTs 


1.17 


1.73 


114346 


Z41450


Hs.130489


"ATPase, aminophospholipid transporter-l 


0.86 


0.82 


114407 


AAO101RR 


Hs 103305 

no. i uoouo 


ESTs 


0.8 


1.88 


114471 


AA09R074 


Hs 104613 

1 ID* 1 1 O 


RP42 homolog 


1.06 


1.34 


114509


AA043551


Hs.101799 


KIAA1350 protein 


1.82 


2.32 


1 1 50R0 


AA953914 


Hs.198249


"Gap junction protein, beta 5 (connexin 


0.79 


1.49 


115091 


AA955900 


Hs.184523


KIAA0965 protein 


0.72 


1.92 


11519*} 

I 10 IZO 


AA95RR49 

rtMZOOO'rZ 


He 936894 


"FSTs Hiah sim to LRP1 hu low densitv I 

1 — \J IO| 1 nyii villi lw Ll\l 1 1 III lun VAOHwHj 1 


0.59 


1.97 


115901 

1 1 ozs i 


AA97Q943 


He 1 99579 


ESTs 


1 


1.25 


11550R 

1 1 00UO 


AA999537 
MMzyzoo/ 


Hs.45207


Hypothetical protein KIAA1335


1.15 


1.48 


115599 
I IOOZZ 


AA1113Q3 


Hs 47378 
no.Hf of o 


ESTs 


0.5 


3.29 


1 1 0000 


AA1471 93 
AAOtf i yo 


Hs.62180


ESTs 


1 


1 


115697


AA411509 

rVAt I I Jut 


Hs 63325 

( IO»V/00£-0 


Homo sapiens type II membrane serine pro 


1 


6.53 


1 1 5Q0Q 

1 1 oyuy 


AA4^RRfifi 

MAIOOOOO 


Hq 5Q761 


ESTs 


1 


6.98 


1 1 5Q7R 


AA447599 
/H/W*f /OZZ 


He RQ517 
no. oyo 1 / 


niffprpntiallv pynrpeepH in Fanrnni anem 
uiho o many cApicoacu jij i auuoiii anvm 


1 


2.31 


11R09R 


AA4591 19 

rWKJZ I J Z 


Hs.42644


thioredoxin-like 


0.99 


1.68 


116107 


AA456QRR 


Hs.92030


ESTs 


1.14 


1.8 


11R134 
1 ID IOH 


AA4R094R 


Hs.50441


GGI-04 protein 


1.11 


1.86 


11R157 
I ID 10/ 


AA4R10R3 


He 449QR 


Hypothetical protein


0.99 


1.9 


11R15R 
1 ID IOO 


AA4R11R7 


Hs.61762


Hypoxia-inducible protein 2 


0.44 


0.86 


I 1 0000 


AA4Q5R1f) 
tt/v*yooou 


H<i R701 3 
no. Of u i o 


"Homo saniens cDNA FLJ 10238 fis clone H 


0.62 


3.89 


11R4R'} 
i lOHOO 


P140Q9 
o i tuyz 


Hs.76118


Dhiniiitin narhnyvl-tprminal p^tera^e L1 


1.04 


2.36 


117^90 
1 1 / OZU 


M9393Q 

iNzozoy 


He 911002 

nb.Z I IUoZ 


LUNX protein; PLUNC (palate lung & nasal


0.51 


0.64 


1 1 7557 
I I / OO f 


N33990 
ixoocjzu 


Hs.44532


Diubiquitin 


1.11 


2.63 


1 1 7RQ'3 
1 1 /oyo 




He 119110 
no* i i£i iv 


PTD007 protein


0.98 


1.79 


117RR1 

I I / 00 I 


M50073 

NOUUf O 


He 960699 


Rnh/ratp-inHiiRprf tranenriot 1 

LJUiy i oit^ ii uubcu uallOMiipi i 


1 


1.43 


1 1 RQRR 
I 10000 


iNOHOoy 


He 4RQ56 
no.*toyoo 


ESTs 


0.67 


2.86 


11P5KR 
I 1 0000 


MRR55R 
IN OOOOO 


He 49824 


Hypothetical protein FLJ10718


1.21 


0.83 


11RRQ5 
i looyo 


N71781 


He 50081 
no.ouuu i 


KIAA1199 see CVA7.doc 


0.88 


1.63 


1 1 0700 


vv/zyo/ 


He 1Q13R1 
no. I y loo I 


F^Te* Wpaklv eimilar in hvnnthpfiral nro 
coia, vvcarxij on itucu 10 iiypoiticiiuai piw 


1 


1 


119845 


VV f 


Hs.58561


G protein-coupled receptor 87 


1 


1 


IZU IUZ 


WQ549R 

vvyo*fzo 


He 139Q97 
no. 1 oZ9z/ 


"FSTe Mndpratplv etmilar tn n53 rpatilat 

Lulu, IVIOUvl Qlv?ljf OLIJIIICIJ IO yJ\J*J l^yUJCJl 


1 


1 


190104 
IZU IU4 


\A/Q5477 
vvyo*+/ / 


Hs 1R0479 
no. i out / o 


ESTs 


0.69 


3.07 


1904RR 
IZUnOO 


AA95^4nfl 


Hs 137569 

no. ioi ouo 


Tumor protein 63 kDa with strong homolog 


1.08 


12.05 


190R5Q 


AA^5015R 

rvAOOU I OO 


Hs 1619 

no. ivis 


Achaete-scute complex (Drosophila) homol


1 


1 


190RRO 
IZUOuU 


AA1R0940 


He 97019 
no. o i \j i cj 


EST 


1 


1 


120948 


AA^Q7R99 


Hs.104650


Hypothetical protein FLJ10292


1.04 


2.15 


1 90QR? 

IZUOOO 


AAQQR90Q 

MMuyozuy 


Hs 97587 

1 15.31 OOI 


EST 


1 


1 


19nR9 
IZ I ooz 


AA40R5nn 


Hs.97939


P.hnnHromofiuiin 1 nrpnnreor 


1 


1 


191 IRQ 
iz iooy 


AA405R57 

MAHUOOO/ 


Hs 128791 
no. i tui c i 


CGI-09 protein 


1 


1.8 


1917Q1 
i z i / y i 


AA49^Q7fl 
AAtzoy / o 


Hs.293317


"ESTs, Weakly similar to JM27 [H.sapiens 


1 


1 


193005 


AA47Q79R 

A/Vt/ y/ ZO 


Hs 105577 
no. i voof i 


ESTs 


1 


1 


19*1044 

IZ0U4*t 


AA4R154Q 
rv\HO i oty 


He 130881 

no. i wuou i 


B-cell CLL/lymphoma 11A (zinc finger pro


0.95 


1.88 


I ZO I ou 


AA4RRRR7 
MAHOOOO / 


He 9R4935 
no.^ot^oo 


ESTs 


1.59 


4.98 
1.64 




M/AoyyHoy 


He 135056 
no. I OOUOu 


rlnnp RP5-fi5flF9 nn rhrnmneomp 2d 
Kjiuwg rvrirUJULa on i/iiiouioooiiic 


1.19 


19Q571 


a AR0RQ5R 
MAO u oyo 0 


He 119619 

no. I 1 ZO l a 


"FSTe Wpaklv similar to PO0109 Purkinie 


1.03 


1.14 


19*}fl9Q 

izoozy 


AAR90RQ7 
MM0ZUO3 / 


He 11990R 
no. I l zz uu 


XAGE-1 protein


1.39 


2.2 


19400R 

IZOUUO 


LJ\J\J<J\J£. 


Hs.108977


ESTs 


1 


4.85 


1 9/05Q 


F1^R7^ 

r too/o 


He QQ769 
ni.y y / us 


ESTs 


1.49 


8.62 


194QR0 

iz^you 


1 IOOOO 


He 194766 
no. i o*tf ou 


Ppi7itrp rplatprl dphp 6 fmnu^pVliks 

wol£.UI\7 1 wICUwU ^wltv O yillUUOwy iiixw 


0.76 


0.77 


125218 


VV/ OOO i 


Hs.110024


NADH:ubiquinone oxidoreductase MLRQ subu


1.33 


1.77 


I Z0400 


ROR041 
r\U0UH I 


He 1 R04R 
no. IOUHO 


"Mplannma anfinpn familv A 10" 

IVIOIal lOlt let allUyCil, ICtlllll/ r\ t lu 


0.8 


1.42 


125759


AA4955R7 


Hs.82226


Glycoprotein (transmembrane) nmb 


1.52 


2.26 


195979 
I z oy / z 


AA4^45R9 


Hs.35406


"ESTs, Highly similar to unnamed protein 


1.05 


2.48 


195QQ4 

izoyy** 


HR57R9 
noo/ oz 


He 970799 


EST 


1 


1.95 


tzooyo 


M701Q9 
In r U. I y z 


He 97R95R 
no.z/ oyou 


Hvnnmptiral nrotpin Fl .112929 


1 


1.35 


19RR45 
IZ0D40 


AI1R7Q49 
r\l i o / ytz 


He 61635 


STEAP1 (Homo sapiens BAC clone RG041D11


1 


2.23 


197991 
IZ/ZZ 1 


AI^R4^^9 

MIOOHOOZ 


He 791R5 
no. f ZOOO 


ESTs 


0.73 


3.27 


19747Q 
JZf **/y 


AAR1^799 

MMO I 0/ ZZ 


He 179799 
no. i / c?i 


nnllanpn* tvnp X* alnha 1 fSnhmid metaDh 


0.51 


1.94 


izoiyz 


AI90494R 
AlZU*tZ4D 




klAA10R5 nrotpm 


1.8 


3.16 


19RR10 
IZOOlU 


I ^RROR 
LJOOUO 


He 10947 
no. I UZh-/ 


apHuatprl Ipi MWi/tp rpll arfhpeinn mol PC I J 
dOUVdLwO icuvouyit? ucn ouiicoioji iiivyi^wu 


0.89 


0.97 


1 9R777 
I ZO/ / / 


U40UU0 


He 1059R 
no. i uozo 


Pv/etpirP anrl nlvrinp-rirh nrotpin 2 

wyOLCIIlw alio yiyuinc muii piuicin £~ 


1 


1 


1 9RQ94 

izoyzH 


AA9^4QR9 

ArtZOHyOZ 


He 9R557 
no. ^000' 


Plaknnhilin 3 
i icirvopiiiiiii \j 

"Rnlntp rarripr family 9 ffarilitated al 

ouiulc oaiiid laiiiiij c yjauiiiiaiou yi 


1.3 


2.97 


19Q0A1 


HRRR7^ 

nooo/ o 


He 1RQQ09 
no. i oyyuz 


0.84 


2.04 


i zyuyy 


nouoyo 


He 10RRR0 
no. i uooqu 


"ATP-hindina caeeette sub-familv C fCFT 

r\ 1 i UlllUiny oaaocuc, uuitioiihij \j i 


0.87 


1.04 


129404


AA172056


Hs.111128


ESTs 


1 


1 


129466 


L42583 




"Genbank Homo sapiens keratin 6 isoform 


0.72 


12.67 


129605 


S72493 


Hs.115947


Keratin 16 (focal non-epidermolytic palm 


0.92 


1.5 


129628 


U26727 


Hs.1174 


"Cyclin-dependent kinase inhibitor 2A (m 


0.85 


1.93 


130023 


X13461 


Hs.239600 


Calmodulin-like 3


0.84 


1.22 


130080 


X14850 


Hs.147097 


"H2A histone family, member X" 


0.98 


1.96 


130385 


AA126474


Hs.155223


stanniocalcin 2 


1 


1 






IOU4 IU 


vni5iA 

VU IO 14 


no. i oo*t<j. 1 


A in h o_/*of A ri rrt fain 

Aipiid-ieiupruimii 

"Witmon HMA PI/ mPMA norfial r>rle" 

numan uiMA-rh. iiikna, paniai cos 


0.63 


I0U441 


UOOOOO 


He qni^R7 
ns.ou 100/ 


1 15 
l. IO 


1 R0AR9 
IOU40Z 


I 39RRR 


He 157R 
no. IO/O 


Baculoviral IAP repeat-containing 5 (sur 


1 


i ououo 


aa4ioor9 

AA40UU0Z 


He 9595R7 


Pituitary tumor-transforming 1 


0.92 


130577 


\VI J 1 u 


Hs.162 


Insulin-like growth factor binding prote 


1.17 


130627 


1 9RR0R 


He 1RQ5 


Matrix metalloproteinase 12 (macrophage 


0.69 


lOUOUU 


A A99Q9PR 
AAZZOOOO 


He 1QR74 
ns. 1 yo/ H 


ESTs; Weakly similar to katanin p80 subu 


1.13 


1R.0QRO/ 

1 OUCJOiJ 


AA5QRRRQ 


He 21400 


ESTs 


0.8 


1R1 OAR 

1 O 1 U4D 


yn95R0 

AU£OOU 


He 994R 


INTERFERON-GAMMA INDUCED PROTEIN PRECURS


0.8


131244 


UOQKJ 1 O 


He 947fi^ 
noxti uo 


RAN binding protein 1 


1 1^ 

1. 10 


131877 




Hs.156346


Topoisomerase (DNA) II alpha (170kD)


i 


131927 


AA4R1 54Q 


He 14780 


"Doublecortex; lissencephaly, X-Iinked ( 


0.81


131QR5 


WQ014R 

V V<J\J 1 tu 


He I^Qfi? 


ESTs


O 7 A 
U./4 


1R1Q7R 
IO IS/ 0 


nsnnoR 


He qfi9^9 
rlo.OD&OZ 


KiAAuioo gene product 


A 
1 


I0Z004 


1 nciQ7 

LUO 1 0 / 


He 91101^ 

ns.z 1 iy io 


Small proline-rich protein 1A 


O RQ 

u.oy 


I OZ040 


AA4171 59 
AA4 1 1 1 OZ 


He 51 01 
ns.o iu 1 


ESTs; Highly similar to protein regulati 


n 7Q 


1R9RR9 


M5Q7RA 
IMOj/04 


He 5^QR 

ns.ooyo 


guanine-monophosphate synthetase 


i 




1 I*}1901 
UO IZU I 


He RAAZi 
nS.0440 J 


"laminin gamma2 chain gene (LAMC2), exon 


H 
1 


1R9RRQ 


7751 on 
Li 0 iyu 


He KAAM 
no. 0440 1 


"Low density lipoprotein receptor-relate 


O RQ 

u.oy 


IOZ/ IU 


WQR79R 

vvyo / zo 


He 5597Q 
rto.ooz/y 


"Serine (or cysteine) proteinase inhibit 


0 R4 
U.04 


1R97ER 

loz/oo 


\A/59vlR9 
W0Z40Z 


He RRIflR 

nS.OO iUO 


flCOTn \JWnnM.. oimilor (A/PlMKn DAT lA/nMK/M 

ho I s, weaKiy similar to wunm ka i wunmi 


1 55 
1.00 


1 R97R7 
IOZ/0/ 


I 051RR 
LUO I 00 


He 9^1fi99 
no. ZO lO^i 


Small proline-rich protein 2B 


U.00 


1R9R1R 


M745A9 
[VI / 404Z 


He R75 
nS.O/O 


Aldehyde dehydrogenase 3 


O 55 
U.00 


1R9QQ0 


AA45R7R1 

Art H DO/ O > 


He 1R1R7 
no. 1 u*ju / 


transcription factor AP-2 alpha (activat 
M A disintegrin and metalloproteinase dom 




1 "3R070 

I OOU / u 


IIRQfiH 

UU30 1 1 


He R4.qi 1 
mo.oho 1 1 


1.10 


133282 


U59QR0 


He 9RR145 
no. zoo 1 ho 


"SRB7 {suppressor of RNA polymerase B, y 




133317 


AA9159QQ 

AA£ 1 QZyy 


He 7nfl^0 
no. f uuju 


U6 snRNA-associated Sm-like protein LSm7 


O OK 

u.yo 


1 ^RR70 

1 000 / u 


AA15RRQ7 

Art 1000?/ 


He 79157 

Mo, / Z SO/ 


Hnmn eartlone mRMA' r«nWA nk'P7nC;RilMQ99 

nomo sapiens mKiNA, cuina ui\rzpoo4i iyz^ 


1 19 
1. IZ 


1TV5Q1 

i oojy i 


AO I 01 J 


Hs.727 


H.sapiens activin beta-A subunit (exon 2


1 R5 
1.00 


100009 


H0RRR7 


He 9411fl5 
no.z*f 1 oifu 


estrogen-responsive B box protein (EBBP) 


1 r»9 

I.UZ 


134032


Z81326


He 7fi c iRQ 
no. r o«juo 


"Serine (or cysteine) proteinase inhibit 


-| 


1 RA1 RR 
1 04 1 DO 


A AQQRQnR 
AAuaOyUO 


He 1R1fiq4 
no. IO 1 004 


nomo sapiens cuina. rLJZoouz tis, cione 


n 05 
u.yo 


1Q/J91R 


AA997/R0 
AAZZ f 40U 


He fln9H5 

ns.ouzuo 


Pim-2 oncogene 


1 RR 
1.00 


I044U0 


DR707X 
KO/Z/O 


Hi? P9779 
nS.oZ/ / Z 


"""collagen, rype XI, alpha 1" M " 


O 7fi 
U, fO 


104400 


V7riRR7 
Af UOOO 


He flq4R4 

□ 0.00404 


SRY (sex determining region Y)-box 4 


1 QQ 

i.oy 


J044/U 


A04y4Z 


He R^75R 
ns. OO/OO 


CDC28 protein kinase 2 


1 Q9 

T.oZ 


104040 




He 1R7T7Q 
no. IOr Of 0 


oancer/testis antigen (NY-bou-i, ui Abi, 


O 09 
U.OZ 


19^470 1 
104/0 1 


Ml / (00 


He R0R9R 
ns.oyozo 


Parathyroid hormone-like hormone 


4 
I 


IO0UUZ 


I I1Q1/17 
U l» 14/ 


He 9794R4 
nO.Z/ Z404 


G antigen 6 


1 


100040 






AFFX control: STAT1 


n no 

u.yz 


101201 


L22524 


Hs.2256


matrix metalloproteinase 7 (matrilysin; 


9 Q9 

z.yz 


I U 1004 


Mfif1759 
IVIOU / oz 


He 131017 
no. it iu 1 / 


H2A histone family; member A


4 

\ 


102025 


UU03 1 1 


He 7RQq4 
no. 1 usut 


mutS (E. coli) homolog 2 (colon cancer;


U.o 


1090^1 
tuzuo I 


1 IfMRQR 
UU40yO 


He 91 5R 

ns.z ioo 


RAR-related orphan receptor A 


1 


109991 

JUZZZ I 


l|9^C7R 
UZ40 / O 




LIM domain only 4 


-1 
1 


109970 
I UZZ f U 


UOUZOO 


He 75RRR 
ns. / 0000 


phosphogluconate dehydrogenase 


1.08 


I UZ003 


1 IT7H99 
UJ/UZZ 


He Q5577 
ns.yoo/ 1 


cyclin-dependent kinase 4 


0.88 


109RQ1 

luzoyi 


U4 I 000 


He 774Q4 

ns,/ /4y4 


deoxyguanosine kinase 


1 r»7 
l.U/ 


lUoUUU 


AO i yoo 


ns. 14000U 


enolase 2; (gamma; neuronal) 


u.yi 


luooyo 


Ay4/04 


Ue 11Q5fiq 

ns. i lyouo 


methionine-tRNA synthetase


u.uy 


lUoboo 


A A0Q1 RQQ 

AAZoioyy 


ns.zu4io 


Homo sapiens mKNA tor tor nistone nzd, c 


ft Q1 

u.yi 


1 UO / Z0 


A A9Q9'39fl 

AAzyzozo 


He 0754 

ns.y/04 


activating transcription factor 5 


0.94 


1 1 /Q/H 
I 1 404 J 


AA9*5/1799 
rt/\£04/ ZZ 


He 554DR 
nS.D04U0 


ESTs; Moderately similar to uALLlUM-DtPh 


0.78 


11590R 


AA9R94Q1 


He 1RR579 

no. 1 OOO/ Z 


ESTs


i 


115Q0R 


AA4^RR1fi 
AA4000 IO 


He fl9^n9 
ns.ozouz 


ESTs


n 7 a 

U./4 


11Q1R9 

j ly ioz 


K^yU40 


He 1f»7Q11 

ns. iu/y 1 1 


ATP-binding cassette; sub-family B (MDR/


1 1 

1.1 


19/.IR0 
1 Z4 1 00 




He 1RQR1R 
no. 1 osojo 


ESTs


I 


19RAR7 
IZ040/ 


AAAR9*inR 
rtr\40Z0U0 


He 1R4RR1 
ns. 1040U 1 


solute carrier family 7 (cationic amino 


l.U J 


197141 
JZ/ 14 1 


AAq07QRn 
AAOU/ you 


He 7547R 
ns./ 04/ 0 


NAAuyob protein 


(\ RK 
U.00 


19R0RA 

IZOU04 


AAQPK7M 

rtrtyUO/ 04 


He 751 Oq 
no. 1 0 1 UJ 


tyrosine 3-monooxygenase/tryptophan 5-mo 


A 

1 


190ROQ 


AA9'3/1 , 3RR 
rtAZ04u00 


He 10945R 

ns. iuz4uo 


survival of motor neuron protein interac 


1 


1 9 QRQc; 


rco/ /OO 


He 1DRQR5 

ns. luoooo 


ESTs 


!./ 


iroiqq 
iou iyy 


248579 


He 17909ft 
no. 1 1 tuLij 


a disintegrin and metalloprotease domain


•i 


1R059A 

1 0U0Z4 


uoyyyo 


He 15Q91A 
nb. 1 oyzo4 


forkhead box E1


A 
1 


IOOUUU 


1 1941 R9 

UZ4I0Z 


He R9409 

ns.oz4uz 


p21/Cdc42/Rac1-activated kinase 1 (yeast


A 
1 


1 OOOOO 


IVJ£0/30 


He 75496 
no./ 04^0 


secretogranin II (chromogranin C) 


A 
1 


10DU4/ 


AAylRO/IRR 


He QQR07 

ns.yooy/ 


ESTs


4 
I 


IUUUOO 


IVIZ/OOU 




ACCVnirlml. OQC rlknMmnl DMA 

ArrA control: zoo nbosomai una 


U.oo 


1001 1/1 

IUU l 14 


uuuoyo 


He R9QR9 

ns.ozyoz 


thymidylate synthetase


U.oo 


100128 


D11094 


He R1151 
no.u l 100 


proteasome (prosome; macropain) 26S subu 


1.29 


IUU J 04 


ni/R^7 

U 1400/ 


He R1RQ9 

ns.o loyz 


KIAA0101 gene product 


0.71 


100161 


ni4RQ4 
Lf IHOSf 


He 7719Q 
no. / / ozy 


phosphatidylserine synthase 1 


1.02 


lUUlOo 


ni4R7A 
U 1 40 / 4 


He 

ns.oy4 


adrenomedullin 


0.46 


10MR7 
IUU 10/ 


ni77oq 


He 7R1R^ 

ns./o 100 


aldo-keto reductase family 1; member C3 


1 


1001RR 
IUU too 


L/Z 1 UOO 


He 57101 

ns.o/ iu 1 


minichromosome maintenance deficient (S. 


0.97 


100217 




He RQ54.5 


proteasome (prosome; macropain) subunit;


1.13 


mri990 

IUU/ZU 


UZ0004 




"""Human mRNA for annexin II, 5'UTR (seq 


1.11 


mn9R7 


nAqoqn 
u»toyou 


He 1R00 
ns. iouu 


chaperonin containing TCP1; subunit 5 (e 


1.13 


100297 


D49489 


Hs.182429


protein disulfide isomerase-related prot


0.92 


100330 


D55716 


Hs.77152 


minichromosome maintenance deficient (S. 


1.07 


100355 


D78129 




"""Homo sapiens mRNA for squalene epoxid 


0.96 


100364 


D78586 


Hs.154868


carbamoyl-phosphate synthetase 2; aspart 


1.49 


100368 


D79987 


Hs.153479


extra spindle poles; S. cerevisiae; homo 


0.59 


100398 


D84557 


Hs.155462


minichromosome maintenance deficient (mi 


1.08 


100438 


D87448 


Hs.91417 


topoisomerase (DNA) II binding protein 


1 






IUU400 


D87953 


Hs.75789 


iV'lllyu tJUWtlbll cdi 1 1 IcyuidlcU 


0.91 


1.48 


iuu*ty i 


HG1153-HT1153 




0.99 


1.41 


IUUO 10 


HG174-HT174 




Ucbl nOpidKlil 1 


1.28 


3.17 


IUUuZO 


HG1828-HT1857 


INcaIIIi Olla-UtJI IVUU 


0.68 


1.9 


IUUD0 1 


HG2874-HT3018 




Rihncnmal Prnfpin ! Wnmnlnn 


1.1 


5.44 


lUUoo/ 


HG2981-HT3127 


tpican, mil opuce 1 1 


0.8 


1.97 


IUU00U 


HG4074-HT4344 


Rad2 


1.01 


2.12 


IU IUO 1 


K03515 


Hs.944 


yiuuubc [Ji luopi laic louiiiuiaoc 


0.91 


1.79 




L10838 


Hs. 167460 


b|JllLltiy IdLLUl, diyil llllc/bcllllc-l 1011 0 


1.23 


1.87 


1011R9 
IU I 1 OZ 


L14595 


Hs.174203


Qnli ito rarripr family 1 fnliitamatp/npntr 
ouiuic odl I ICI i di liny i ^yiuioiiiaic/ncuu 


1.35 


2.73 


mufti 


L19686 


Hs.73798


maprnnhano mJnraffnn Jnhi'hffriiv farfnr ( 
1 1 idiji upi lay c iiiiyicHiuii iiiiiiuiiuiy lauiui ^ 


1.03 


1.78 


mi iRq 

IU I 100 


L19779 


Hs.795 


H9A hicfnnp familv mpmhpr O 

\ \£T\ lilolUlIC ICHIIIIj! IIICIIIUCI W 


0.57 


1.3 


101216 


L25876 


Hs.84113


rvrlin-ripnpnHpnt kina«;p inhihitor 3 ?CDK 

vj^vlll I^JCJJd iuci ii rvi i logo jiiuiuii^/i w y wlji \ 


OJ 


2.2 


101228 


L27706 


Hs.82916 


rhanprnnin rnntaininfi THP1* ^uhunit 6A I 

\j\ idpui unii i ^ui Kail ii ny i \jr 1 1 ouuuiiii un ^ 


0.99 


1.99 


101233 


L29008 


Hs.878 


snrbif/i) HphvHronpns^p 

OsJIUHSJl UCTIjr ui l/yC( lOJC 


0.82 


2.11 


10,1947 
IU IZ*H 


L33801 


Hs.78802


n\\ipr\riGn ^unfhaQP kina^P ^ hpta 
yiybuycii oyiiuiaoc ruiidoc o ucia 


1.2 


1.91 




L47276 




"""HrnTin can!pn<s ^rpll linp HI -R^ alnha i 


0.69 


2.78 


IU I04Z 


L76191 


Hs.182018


Infprlpt ikin-1 rpppntnr-aQQnfiatprf kina^p 


1.04 


1.84


■*iPi1*3QR 

luioyo 


M15796 


Hs.78996 


nrnlifcirotinn noil ni tr>lpar anfinon 

piuiiieiaiiny uon iiuuicdi diiuycii 


0.95 


3.55 


ipi1/9*5 
1U14Z0 


M18391 


Hs.89839 




1 


1.5 


1U1440 


M21259 


Hs.1066 


small nuclear nuuiiuuieuproiein puiypapi 


1.21 


1.96 


luloUo 


M27396 


Hs.75692 


dspdrayiiic byimicidbc 


0.93 


1.6 


101525 


M29536 


Hs.12163 


eukaryotic translation initiation factor 


1.19


1.93 


IU 1000 


M30448 


Hs.251669 


Aacoin Idnaco 9' hpfa nntunontlHp 
Odbclll MJldbC C, ucia puiypcpilUu 


0.96 


1.42


1Pi1RPi7 
IU lOUf 


M38690 


Hs.1244 


rTlQ anfrnpn (r\9A\ 
\jUU dllUycll \\Jc.H} 


1.11 


1.25 


IU IOZ4 


M55998 




"""Wiiman alnha-i pnllanpn fx/rip 1 npnP ^ 
nUIMdll dipilcrt UUlldycll Lypt; I yuiic, o 


1.17 


1.98 


1U1 /Do 


M77836 


Hs.79217 


pynunnc-o-udnjUAyidic icuuuiaoc i 


1.77 


3.45 


imfi^Q 
iu looy 


M93036 


Hs.692 


mpmhranp rnmnnnpnt' rhrnmnQnnnal 4" ^nrfa 
I iiui iiui di its uui i ipui ici ii, i/i iiuiiiuoutiiai oui ia 


0.71 


1*45 


lUloOo 


M94362 


Hs.76084 


Idlllin DZ 


0.84 


1.19 


IU 13/ ( 


S83364 




"""nufaflup Rah'vintprar'finn nrntein (c\ 


0.89 


1.9 


1/11009 

iu iyyz 


U01038 


Hs.77597 


pUlU yUl UbUpilldJ-lll\c Kllldbc 


0.66 


1.46 


i uzuuy 


U02680 


Hs.82643 


nrntoin turncinp klnacp Q 
piUlcIN lylUblllc isilldac v 


1.23 


3.35 


lUzUIZ 


U03057 


Hs.118400


einnoH /nmennh5la\_lil/o /coa iirphtn fac 

biiiysa ^urubupuiid^-iiKo ^bed uiuiiiii id%> 


0.85 


1.88 




U05861 


Hs.201967 


atHn L'ptn rprlt irfacp familu 1 • mpmhpr fl 
dlUU-rvclU icUUOldoc Idlllliy l| IIJClllUcI 


0.93 


2.32 


10.919*5 
lUzlzO 


U14518 


Hs.1594 


nan\mmara nrnlpln A /'17L*^^ 

i/ciiuuiimic piuiciii m \ i / wu) 


1 


428 


■inoH *5f\ 


U15009


Hs.1575 


email ntirlocir rihrtniipfannrAloin IT5 nnli/n 
SlTldU JlUUcdJ JjUUUUUlfcJUpi OICU1 uo puiy|J 


0.89 


1.42 


iuZ14o 


U16954 


Hs.75823 


Al 1 1 fueoH nana frnm chmmncniTio 1n 
MLLI-TUbcU ycllc UUIM UIHUMIUoUillt; m 


0.8 


2.95 


lUzzlU 


U23028 


Hs.2437 


eUKdryOlIC UdMbldllUll IlllUdllUII IdLrlUI 


1.01 


1.34 


10,9990 
lUZZzU 


U24389 


Hs.65436 


lx/c\/l aviHqcp IiUp 1 

iybyi uxiudoc-iiixc i 


1.15 


2.34


1099R0 

IUZZ0U 


U28386 


Hs.159557


karunnhorin alnha 0 fRAf5 rnhnrt 1" imnnr 
r\di yupilcllll dipitd c. ^r\Mv3 tiUMUll >■ uiipui 


1.14 


2.69 


109*5*50 

luzooU 


U35451 


Hs.77254 


rhmmnhnv hnmnlnn 1 /nrncnnhlla HP1 hpta 

uiiiuniuuoA nuinuiuy i ^uiubupintd nr i uuid 


1.05 


1.7 


10*3/9*5 
1UZ4Z0 


U44754 


Hs.179312 


email niiHpar RK)A apHuatinn rnmnlpy nn 
ollldll lluulcdl rxlMrt dUUVdUliy ouilipiCA, pu 


1.14 


2.99 


IUZ40O 


U48705 


Hs.75562 


riiefrvrlin rlr^rviatn roj^QrtlAr famiKi* momhcir 

□ibuoiuin uomain rccepior idiiiny, inuniutji 


1.05 


2.01 


109AQQ 
juz**yy 


U51478 


Hs.76941 


ATPasp* Ma+/k'+ tranennrtinn* hpta 3 nnlv 

Ml r doc, iNd^VfN" u ai lopui ui iy, uoio u yyjiy 


1.27 


1.92


IUZOZZ 


U53347 


Hs.183556


cnlnto rarripr familu 1 fnpntral amino a 
ouiuic Udi l id laiiiiiy i \iicuudi autiiiu a 


0.84 


1.31 


IUZ03U 


U62136 




"""Hnmn <sanipne pntprnrufp riiffprpntiati 


1.11 


1.6 


i uzu > o 


V725 14 


Hs.12045


niifafit/p nrntein 

pUldUVC (JIULCIII 


1.04 


2.17 


io9RP7 
luzoov 


U73379 


Hs.93002 


rihiniiltln parripr nrrtfpin PO-f" 4 

uuiyuiiKi udinur piuiciii cz-L/ 


0.86 


2.28 


10970A 

juz/u^ 


U76638


Hs.54089 


RRPA1 accnriafpH RIMn rlnmain 1 

Dlxvnl daoUOIdlCU r\ll>IO UUIIIdlll l 


1.12 


1.63 


1097ft1 
JUZf 0 I 


U83843 




"""Human Wl\/-1 Mpf inlprantinn nmtpin t 
nuiiidii niv- i iNoi iiHcidouiiy piuiciii ^ 


0.9 


1.39 


1097ft/ 

lUZ7o4 


U85658 


Hs.61796 


UdJ)oL}ipilL>n IdOlUJ Mr-Z ydllJIIld ^dULivdl 


0.98 


2.16 


ioon97 
lUZoz/ 


U91327 


Hs.6456 


phanprnnin rnnfaininn TPP1* Qiihnnit 9 (h 
i/iidpcruniii LUiiidiiiiiiy i or i , ouuuiiii c \u 


0.96 


1.62 


nuzyoo 


X13482 


Hs.80506 


email nuptaar r5hnniir»lpnnrrtfpin nnlunonf 
oiiidii nucicdr iiuuiiuuicupiuiciii puiyfjcpi 


1.21 


4.2 


109Q79 

i uz y / z 


X16662 


Hs.87268 


annovin Aft 
dllllCAlll MO 


1.25 


2.32 


10900*5 
tuzyoo 


X17620 


Hs.118638


IIUII~lllcldbLdUl> Ocllb I, piULclll \l\JVItur\^ 


1.03 


1.83 


10*509*5 
IUJUZ0 


X53793 


Hs.117950


miiHifnnnlinna) nnlunpntlrlp similar tn S 
iiiumiuiioiiuiidi puiypcpuuc oiunidi iu o 


1.58 


5.44 


lUdUOO 


X54941 


Hs.77550 


PHOOft nrnfpin ktnacp 1 
\j\J\j/.0 piulGlM Mlldbc I 


1.32 


3.79 


103075 


X59543 


Hs.2934 


riDunucicuuuc rcuuuidsc rvi i puiypcpuuu 


1.11 


2.58 


ioqirp 

lUOlOO 


X68314 


Hs.2704 


nlnfafhinnp nprnvirlaep 9 /naetrnintpetin 
yiuiaimunc pcruAiudoc z ^ydbuuiiiicoun 


0.75 


3.05 


103185 


X69910 


Hs.74368 


iransnrisnriDrane proiein ^ooku^, ciiuupiasrni 


1.01 


1.97


10191 9 
luoziz 


X73874 


Hs.2393 


pnuopnuryidbo tvuidbc, dipiid i ^iiiuijum^ 


0.95 


1.72 


10*599*5 


X74801 


Hs.1708 


rhanprnnfn rnnfalnfnn Tf^P1* ^uhnnff 3 in 

\jl lap el Ul IU I uui nan in iy i vr i , ouuunii \j 


0.97 


1.77 


IUJZDU 


X78416 


Hs.3155 


paQpin* alnha 
bdoclll, dipiid 


1 


1 


10Q9R9 
lUOZOZ 


X78565 


Hs.204133 


llCAdUidLIMUIl ^IcIldbL-ltl vy, uyiUld\*UIU 


1.23 


3.09 


IUOOOU 


X85373 


Hs.77496 


email nnplpar rihnniipipnnrpitpin nntvnpnt 
ollldll liuuicdi i iuui luuicupi uicn i puiypcpi 


1.12 


2.25 


in*3*5C/l 

1Uoob4 


X90872 


Hs.75854 


CI II Tit* entf/^lranefar'jca 
OULI lubUliOUdflblcrdbu 


2.85


4.62 


10*5*57K 
lUJOf 0 


X91868 


Hs.54416 


einp ripnlie hnmpnhnv fTlrriconhila^ homnlo 

OlIIC UuUILD IIUII1CUUUA 1 U\ UOwpl IIIOJ iiuimuivj 


1 


248 


luooyi 


X94453 


Hs.114366


pyrroiin6-o-cdruUAy(dio &yniiicidoo \yiui 


1 


1.53 


m*5viovi 

1 U04U4 


X95586 


Hs.78596 


nrnloopnnnD /nrftCfuno' maprnnain^ eilhimit' 
piUlcdbUII 1c ^piUbUlllc, IlldUl updiliy ouuutlK, 


0.92 


1.53 


10Qj*i*57 

lU0407 


X98260 


Hs.82254 


ivi-pndse pnuspnopruioin 1 1 


0.92 


1.54 


10*5iMft 


X99133 


Hs.204238 


iipuudini z ^uiibuyuiiy ztpoj 


0.55 


0.96 


luobuo 


7*50100 
Z.OD4UZ 


ns.iy4oo/ 


nir\hr\rirt 1 * C nnr\ horin /anlfhalial\ 

caunenn i, tz-cauncnri ^cpiuiciidi; 


1.32 


2.51 


10*5RytR 


Z68228 


Hs.2340 


juiiL-uon pidnuyiuuiit 


0.88 


1.28 


TUdOOo 


Z74615 


Hs.172928 


conagen, iypc i, dipna i 


1.06 


2.98 


103774 


AA092898 


Hs.92918


CQT e . \Maalt\\t clmllar in Rn7r5*^ R fC. plana 


1.88 


4.66 


10/I9R1 
1U4Z01 


AF008442 


Hs.5409 


DM A nnlwiYiQracD 1 citHimit 
PtiNM puiyiMcldou 1 bUUUlllt. 


0.87 


2.17 


10/97R 
IU4Z/0 


C02193 


Hs.85222 


F^Tc W/paklv efmilar in R970QO 2 TH earn" 
C.O I b| vvcdMy oil i nidi iu r\t-i U3u_t \t t.odpi 


1.4 


2AQ 


10;19QQ 

iu4zoy 


C16281 


Hs.75478 


KIAAnQ^R nrntpln 


1.15 


1.68 


104434 


L02870 


Hs.1640 


collagen; type VII; alpha 1 (epidermolys


1.04


149 


104453 


M19169 


Hs.123114 


cystatin SN 


0.38 


0.76 


104611 


R98280 


Hs.125845


ribulose-5-phosphate-3-epimerase


1.08 


2.25 


104758 


AA024661


Hs.7010 


ESTs; Weakly similar to ACYL-COA DEHYDRO 


1.14 


1.65 


105114 


AA156532


Hs.11801 


adenosine A2b receptor pseudogene 


0.91 


1.38 


105132 


AA159501


Hs.247280 


HBV associated factor 


1.08 


1.7 


105174 


AA186613 


Hs.34744 


ESTs 


0.95 


2.05 






IU0Z0U 


AA9*5991R 
AAZOZZ 10 


He I4finn 

rl5. I HDUU 


ESTs 


1 


1.4 


Iu0o44 


AAZOOOUO 


He RRA^ 

ns.oo*fo 


ESTs


0.72 


2.02 


lUOOlb 


A AOE7Q71 

AAzo/y/i 


Lie 91914 
MS.z ) z 14 


ESTs


1.35 


3.56 


luObzl 


A A90flQRR 
AAZoUOOO 


MS. 00/ 0 


Mumo Sapiens inrsiNA, cuina ur\rzpooHi\utZi 


1.23 


1.82 


luobyo 


A A OQ7QQQ 

AAzo/oyo 


MS. IOZUZ 


ESTs; Weakly similar to oligodendrocyte-


0.98 


1.28 


105705 


A A001Y7C7 


U c 1H19R9 
MS. IU IZOZ 


Mumo sapiens mr\iMM, cuina uiNrz-p^OHD i uz ^ii 


0.92 


1.32 


1UO/Z4 


AAzyzuyo 


Uc 99Q^A 

MS.zzyo^ 


PQTc* WonHw etmilar in 7IMP FIWfiFR PROT 

to i s, vveaKiy similar iu z.iimo niNocrx riwj i 


0.99 


1.41 


105782


AAoOUzlO 


Uc 91RPH 
nS.Z IOoU 


ESTs


1 


•j 


1 rtG7QQ 

luo/yy 


AAO/zUlo 


Uc 9A74^ 
nS.Z4/40 


ESTs


1.08 


1.78 


105807 


AAdyooUo 


Uc 1RRRQ 

ns.lboby 


to i s, mouer aieiy similar 10 oullaijcin ali 


0.95 


1.34 


105891 


A &Ar\f\7RQ 

AA4Uu/bo 


Uc 9RRR9 
MS.ZDOOZ 


to is, vveawy similar 10 xumur iicurusis i 


0.87 


2.25 


105936 


AA404338 




cols 


1 14 
i. i*+ 


1 4fi 


lUouoy 


A A>1177/1 
AA4 1 / / 4 1 


U c 9QRQQ 

Ms.zyoyy 


PQTc WpaWlu elmilnr tn 7IWP PIMRFR PROT 
to i S, VvcdMy ollHMdl IU Z.NMO iIINVJCia r r\\J 1 




1 


lUbluo 


A A A1A 1 (\A 

AA4zllU4 


Up 1 Or\QA 

MS. izuy4 


ESTs


1.04 


1.44 


IU014U 


A AAOAROA 
AA4Z40Z** 


Uc 14Q19 

ms. i*fy iz 


klAAf!9RR nrntpin 
rxiAAUZOO piuicin 


1.23 


2.11 


iuoi4y 


AA4Z4ool 


Uc 9RR^m 
MS.ZODOU I 


ESTs


0.83 


1.48 


A ncA CA 

lUo104 


AA420JU4 


Uc RQOA 

ns.byy4 


ESTs


0.77 


2.05 


lfiRiR9 
luo i oz 


AA49RRflQ 
AA4Z00Uy 


Uc 1flRR9 
MS. 1 UODZ 


ESTs 


0.74 


2.23 


luozzu 


AA4Z000Z 


Uc ^91QR 

ms. oz iyo 


tola, IViUUcldlcly binilldl IU 1 1 Icldl y lull 1 p 


0.97 


1.99 


lUbzzo 


AA4zyzyu 


Uc 1771Q 

ms. i / / iy 


ESTs


0.99 


1.54 


lUbolo 


A Ay1*3Rf;7n 
AA4000/U 


Uc QfiflR 


nro mRKIA olo nwnnn far»fnp 1 m /OKI/ \~W 

prs-nirMNrt uicdvage iduiui mi ^zokuj 


0.95 


2.09 


IU004 1 


AA4417QR 

rjvm i / y o 


Uc C94Q 
MS.OZnO 


PQTe* Mnrlprfltptu eimilnrfn n!l 9 hunnihp 
lu i a, iviuuci aiciy animal iu pii-^ nypuuic 


0.98 


2.66 


IU040Y 


AA44000U 


Uc 1711R 
MS. 1 / 100 


ESTs


0.95 


1.93 


1 (\RA7A 

1U04/4 


AA40UZ 1 Z 


Uc AOARA 
MS.4Z404 


Unmn csnipnc mRKIA' rnWA nkF7n^R4rnR^ ffr 

nuinu sapiens iiiiaina, ouina ur\ri.poo*t\-»uoo 




•j 


1Uo4oo 


AA/li;1R7R 
AA40 10/0 


Uc 3A9QQ 

Ms.ouzyy 


1C2P II mRMA hlnHInn nrnlpin 9 
lor-ll Nlr\INA-UlllUlliy piUlcltl Z 


1.4 


2.29 


fUboyy 


A AvlK79'3R 
AA40/ ZOO 


He 19R4.9 
MS. I ZOfZ 


PQTc MnHpratol\/ cimllar in nrin-fnnnHnn 
tola, iviUUcI dLciy bllf llldl IU IIUII-IUHUUUII 


1 


1.82 


lUbbl 1 


AA40oyU4 


He 9R9R7 
MS.ZOZO/ 


PQTe* \A/qqI/Ii/ clmllfar in trireinA fl-l eanlo 

to i Si vvcdMy similar 10 lursiiiA [n.sdpic 


1.49 


2.78 


!Ubb04 


AA4bU44y 


He ^7RA 
MS.0/04 


PQTc Pinhlw eimllor in nhrtenhneprino om 

to is, niyniy siiiiiidr iu pnuspnusciiim din 


•) 


1.4 


107076 


A ACDQivIC 

AAbUyi40 


He 91143 
MS.Z1140 


PQTc VA/paUlu cimllar in fnc^QR^4 1 TH ea 

to i s, vvedKiy similar iu iosoyoo**_i [n.od 


1.11 


1.49 


1/V71 AR 
III/ 1 10 


a ARin-inn 
AMD I U I uo 


He 97RQ3 

Ms.z/oyo 


PQTc* I— Ifn Htvr cimllar in Pfil-1 94 nrnfpin 
co I s, niyniy oinnidi iu l*oi - iz*t piuicin 


\ 


1.03 


1D719Q 

iu/ izy 


AAR9HRR'} 
AAOZUOOO 


He A75R 
MS.^ff OO 


flan elninfnro-cnor^Iftr* pnrlnniiplpaeP 1 
lldp SUUlslUIU'opcl/UlLr cl IUUI lUUIcdoc I 


1.13 


3.63 


107159 


AAbzl o4U 


Uc mRnn 

MS. IU0UU 


CQTc- Waatflw cimllar in PiRP YkfRflRIn TQ r 

Co i s, vveaKiy similar to ur\r t i\r\uo iu [o.u 


1.05 


2.09 


AI\7AAA 

1U/444 


wzooyi 


Uc R1R1 
MS.O 10 I 


proiuerauoii-dsbuciaicij zo**, ooi\u 


1.18 


1.9


ACY7ADA 

lu/4ol 


W0OZ4/ 


He 01AX1 

MS.z/ *to/ 


nOlliu bdpicllS MllaSIII SUpclldlllliy niUlUI r\ 


0.99 


2J4 


1P.7K1R 
lUYOlb 


Aoooy / 


He QQRR3 

ns.yyooo 


fShrillarin 
llUllllalin 


0.94 


1.77 


lU/Ozy 


YlZUOO 


He RflQ9 

Ms.ouyz 


ni rr»loAlar nrr\fatn /l*fl*fP/M ronoofl 
nuulculdl piuicin lr\r\C/U repeal/ 


1.05 


2.29 


lU/OOl 


Yioydb 


Uc 1700*3 
MS.l /000 


protein pnospiidiase io ^Turmeny zo/, md 


1.06 


1.62 


lu/oUl 


AAUiy 4oo 


Uc i7^inn 

MS. I / 0 IUU 


ESTs


1.03 


14 


lu/yo/ 


AAUO 1 340 


He R7R4R 
MS.O /OHO 


ESTs


0.95 


1.46 


lUoobo 


AAUO004Z 


Uc 1R0R 
MS.10Z0 


a i rasci Vw>a + ^ iranspomny, oaruiac niuooi 


0.59 


1.35 


108780 


A A A 

AAlzoobl 


Uc 1 1 7QQQ 

Ms.n /yoo 


collagen; type XVII; alpha 1




7.63 


108828


A hA'iAKQA 

AA1o10o4 


Uc 7\A'VZ 
MS. / 1400 


nk'P7PER4r)fi4fi'-l nrnlain 
Ur\rZ.r00 £ tUU*tO0 pruiem 


1.33 


2.56 


109060 


AA160879


Up OvMEKI 

MS.241001 


chloride channel; calcium activated; fam 


0.67 


1.42 


109112 


A A ACQ'37Q 

AAibyovy 


Uc 790RR 

MS. f zobo 


ESTs


1.03 


2.31 


109344 


A AOA QCQR 

AAzlobab 


Up QRRRQ 

MS.obOOy 


poly(A)-binding protein-like 1


0.97 


1.55 


•1/ioyM o 
TUy412 


AAZZ/140 


He 9HQA73 

Ms.zuy^/o 


PQTe- WpaWv eimilar In RP-T5I II ATOR OF MIT 
co i s, vveaKiy oimnar iu r\cv3ULr\ t \jr\ ur ivii i 


0.76 


1.87 


"1 H mart 
110780 


N23174 


Uc OODQ1 
MS.2Zbyi 


solute carrier family 7 (cationic amino 


0.9 


0.95 


nuyoo 


N0U00U 


He DAZSV7 
MS.Z400/ 


einnal fra ncH l lf>(ir»n nmtoin /QUI pnntain 

siynai udiiouubiiun proicin [ono uuniaii i 


1.17 


2.26 


111018


N54067 


Uc QROO 

mS.ooZo 


mitogen-activated protein kinase kinase 


1.21 


1.85 


1 1 1 QQ7 
11 100/ 


N/yoiz 


He 1RR(17 
MS. IOOU/ 


PQTc* Hinhlu eimllor in Mi/neln hpav/u rha 
to I o, niyniy oiiniidr iu iviyubin iicdvy uiid 




1.45 


112305


Ko4ozz 


Uc ORIAA 
MS.ZbZ44 


ESTs


1 


•) 


11Z4U1 


Kbiz/y 


He 937R3R 
MS.ZO/OOO 


PQTe- Wpaklv eimllarin P9<^RS ? \C p\pna 
to 1 s, vvedMy bimnar iu rzuDu.o [o.eicyd 


1.24 


1.64 


1 1 ZOOd 


1 UZo4o 


He AIRi 
MS.hOO 1 


PQT 
to i 


1.56 


1.96 


1 A OQCQ 

1 1 2869 


lUoolo 


Uc A7A7 

nS.4/4/ 


aysKeraiosis conyeniia i,uysKenn 


1.03 


1.57 


1 10QQ9 

uzyyz 


1 ZOO 10 


He 7147 
MS./ IH/ 


ESTs 


1 


1 


1 1 *3n4R 
1 1 0U40 


i zooyo 


He IRdfitlR 
MS. 1 0*tUUO 


PQTe* Wpaklu elmilarln RMA-hlnHinn nrnt 
CO I o, VVCat\iy ollilllal IU r\M/A Ull lull ly ptui 


1.37 


2.26 


1 A Qnci 
lloUbo 


1 0Z4OO 


He Rfl97 
MS.OUZ/ 


PQTe 
CO I s 


1 


1 


1 1 Q17Q 

noi i y 


TCR1 DO 
I0010Z 


He 1^9^71 
MS. IOZO/ I 


PQTe* Minhlv eimilar in IfiF.II mRMA-hinrl 
co I b, niyiuy bimiidi iu lur-n nirxiNrt-uiuu 


1.33 


2.7 


113573 


TQ1 1RR 

lyiibb 


He IRQQfi 

ms. i oyyu 


PQTc 
CO 1 s 


0.76 


147 


noun 


W44yZo 


He 4R7R 
M5.H0/0 


PQTe 
CO 1 S 


0.79 


1.51 


11/inQR 

114Uob 


z.ooZt)b 


He 19770 
MS. IZ//U 


HniYin eanlpne PAP f»lnnp H Ifi777f")9^ frnm 7n 
nUII IU bdpicllb ~AL/ UlUllc UJU f 1 1 \Jd-\J ilUIII / p 


0.9 


1.34 


11>1CQ7 

11400/ 


A An7OQ07 

AAU/Uoz/ 


He iflm9n 

MS. IOUOZU 


PQTc* Wpaklu elmllar in fifll C\ 4-TRANQMFM 
co i s, vveaiviy similar 10 oului h- i r\AiNoivicivi 


1.02 


176 


1 AAQAR 

1 1 4o4b 


AAZ04yzy 


He AAIAI 
MS.HH040 


PQTc 
CO i s 


1.32 


2.36 


114964 


AAz4oo/o 


Uc QOIOvl 

nS.£3Zlo4 


ring finger protein 3 


1.1 


1.84 


115047 


AAzOZbz/ 


Uc 99 KG A 

nS.ZZ004 


homeo box B5 


1.01 


2.36 


ilolbb 


A AOCQ/HQ 

AAZ0o4Uy 


He 1QRQH7 

ms. lyoyu/ 


mud in nrrtl^ln Tnm 111/ a 1 

myelin protein zero-iiKe i 


1.05 


2.31 


115167 


A AOCD/H 

AAz0o4z1 


Uc A^toa 

MS.40/ZO 


hypothetical protein 


1.52 


2.52 


i i ozoy 


A A07JJRR0 
AAZ/OOOU 


He 739Q1 

Ms./ozy i 


PQTe* WoaHu cimllar in cimllar in Ihp h 
co i s, vvcdMy similar iu binuidi iu me u 


0.7 


2.57 


1 1 5278 


A A07Q7R7 

AAZ/y/0/ 


He R7./1RR 
M5.0/400 


PQTc* WpaHu cimllar in RAPN^fil 1 rf \D m 

co i s, vveaKiy similar 10 dav^inozoi i.u [u.iii 


1.14 


2.12 


1 A CCCO 

lloboz 


AA4U0UytJ 


Uc 3R17R 
MS. 00 1 10 


PQTc 
CO 1 5 


0.82 


4.67 


1 1 5875 


AA4ooy4o 


Uc A"iCiAR 

ns.4oy4b 


co i s, vveaKiy similar to vveaK sirniidniy 


1.2 


198 


4 4 Cf\f\A 

1 1 bUU4 


AA44yi Zz 


Uc 7RHOR 

ns./ouoo 


CQTc» UlinnN/ ciivill^f tr\ cnnall "7irin finrto 

to i s, niyniy simitar 10 smaii ziiiif imye 


0.96 


1.31 


llblzl 


AA40yz04 


nS.40000 


PQTc 
CO 1 S 


0.97 


1.55 


iibizy 


A A/4RQQCR 

AA4oyyob 


Uc AQARI 

ns.4y ioo 


PQTc* Hinhtu cimllar in nnlafiv/P nhnnilf 

co is, niyniy similar iu puiduve iiuunuo 


1.08 


2.73 


nbiyu 


A AXRylQRQ 
AA404y00 


He R777R 
MS. Or / /O 


PQTc 
CO 1 S 


0.8 


1.57 


116312 


AA4yU4y4 


Uc RCytrtQ 
MS.b04Uo 


PQTc 
CO 1 S 


1.37 


2.65 


116732 


F13779 


Up 4CGOnQ 

Ms.iooyuy 


CCTc 
to IS 


0.92 


1.8 


117602 


NooUzU 


Uc AARQR 

HS.44bbO 


PQTc \Mai\s\u eimllor in f2(~\\ IATH PRDTPIM 

to i s, vveaKiy similar 10 xjulia i n rr\u i chn 


1 is 


1.84 


1 1 /you 


llJ 1 OSH 


He 7R47R 


KIAA0956 protein 


1.04 


2.36


117992 


N52000 


Hs.172089 


Homo sapiens mRNA; cDNA DKFZp586B0222 (f 


0.62 


1.29 


118785 


N75386 


Hs.111867 


GLI-Kruppel family member GLI2 


1 


1 


119717 


W69134 


Hs.57987


ESTs 


1 


1.4 


119814 


W74069 


Hs.58350 


ESTs 


0.78 


177 


120128 


Z38499 


Hs.91448 


MKP-1 like protein tyrosine phosphatase 


0.86 


1.46 


120242 


Z98443 


Hs.86366 


ESTs 


0.83 


2.01 






120483 


a a ncinn/i 

AA252994 


II. 4E7Q 

HS.Ib/o 


121054 


AA398604 


I |_ 0700"? 

Hs, 97387 


121326 


AA404246 


II. O7fl04 

HS.y/Uol 


121376 


a a jnccnn 

AA405699 


n. 4cco , ao 
HS.IbbZoz 


121457 


AA411448 


I I- nnonoc 
HS.2U8985 


121780 


a a yionnoc 

AA422085 


HS.lz4bbu 


121781 


a a jih4 cn 

AA422150 


Up QQ07n 

HS.ybo/U 


121844 


A Av1OE7Q0 

AA4Zb/oz 


Up QQylQC 

Hs.ytwob 


4 linen 

122059 


A A j(04707 

AA431737 


HS. 98/49 


A OOOOQ 

122338 


A A >MQ04 1 

AA44J31 1 


Up QQQQQ 

ns.yoyyo 


122354 


A A A JI0770 

AA44o/ 11 


Up IQCRQO 

Hs.ioooyz 


4 nocru 

122591 


AA45o2bb 


Up O.GQ4 4 

Hs.yyji i 


4 nmnn 

122790 


A a ^cni EC 
AA4bUlbb 


Up OOCCC 

Hs.yyoob 


4 noono 

123398 


A A E04 OGE 

AAozl zbb 


Up -incziA 
HS.lUobl4 


123518 


a a en o e o 4 

AA608531 


Up 47HQ4 O 
HS. 1/0313 


123673 


a a cr\r\k~iA 

AA609471 


Up 1 4 074 O 
HS.112/12 


A niinnn 
124000 


D57317 


Up "7AQR4 
HS./400l 


124367 


N24006 


Up QQOXD 

Hs.yyo4o 


124447 


N48000 


Up A AftCtAc 

HS. 140945 


125756 


W25498 


Hs.81634 


125769 


AI382972


Up Q041Q 

HS. 82128 


125852 


H09290 


Hs.76550 


125924 


AA526849 


Hs.82109 


126037 


M85772 


Hs.6066 


126214 


N29455 


Up ~7A 01 C 

HS./4olb 


126414 


N78770 


Up OOOVIOO 

HS.223439 


126737 


A A AOOA OO 

AA488132 


Hs. 62741 


126743 


A A A 700CJ 

AA1 79253 


Up 4 704QO 

HS. 172182 


126926 


A A A~1(\C AC 

AA 179546 


Up QOO 

HS.832 


1 27432 


A a enATOA 

AA501734 


Up 4 7/V3 4 4 

HS.l /Uoll 


4 noon Q 

128218 


H02682 


Up qq 1 on 

Hs.yyiby 


1 28527 


MO 4 COO 


Up 4C\M\A~I 

HS.1U1U4/ 


128568 


Xb067o 


Up 0/17CCQ 

HS.Z4/5b8 


128584 


M11433 


Up 4 f\4 ocn 

HS.101850 


128628 


C1 4037 


Up OC4 (Y7Q 

HS. 251 978 


128691 


W27939 


Up 4flOOOjl 

Hs. 103834 


128714 


V00599 


Up 4 70CC4 

Hs.179661 


128733 


a a ononno 

AA328993 


Up 4ft/tCCQ 

HS. 104558 


•1 0 070 *< 

128781 


X85372 


Up 4flCylCE 
Hs. 105465 


4 nnnci 

129052 


a a >iricon*7 

AA496297 


Up 4 Q07./1fl 

HS. 182740 


a nnnnc 

129095 


L12350 


Up 4 0.QC00 

HS.1Uob23 


129241 


A A AOCCCC 

AA435665 


Up 4nD7/>C 

HS. 109706 


129665 


M88458 


Up 4 4 Q77Q 
HS.llO/ lO 


129703 


A A Jin.4 0/1 Q 

AA401348 


Up 4 7QQQQ 

Hs.i /yyyy 


4 oivzon 
129720 


A A /7CCQO 

AA4/bboz 


Up 401CO 
HS.lZlOiC 


a onocn 

129850 


N2Q593 


Up ECQjIC 

HS.5b845 


129896 


a a nAnniA 

AA043021 


Up 4 OOOE 

Hs. 13225 


130069 


AA055896 


Hs. 146428 


130405 


H88359 


Up A ECODC 

Hs. 155396 


130541 


vnceAO 

X05608 


Up 04 4EQ4 

Hs.211584 


130599 


M91670 


■ i„ a~i A rx~tr\ 

Hs. 174070 


130867 


J04093 


Hs.2056 


131009 


AA063596 


Hs. 22142 


131028 


U20240 


Up 0007 

Hs.2227 


131083 


U66661 


■ nmoc 

Hs. 22785 


131091 


T35341 


Up OOQQfl 

Hs. 22880 


131144 


C14412 


Up OOCOQ 

HS.23528 


131148 


000038 


Up O0E7Q 

HS.235/9 


131164 


Y00503 


Up 4 QOOCE 

HS.1822bb 


131185 


M25753 


Hs.23960 


ini o4 n 

131219 


C00476 


Up Oiionc 
HS.24395 


131454 


A A iJCEQnC 

AA455895 


Up OCOO 

HS.2b99 


131687 


L11066 


Up OACn 

Hs.3069 


131689 


AA599653 


Hs.30696 


131692 


D50914 


Hs. 30736 


131786 


a A A once >t 

AA1 35554 


Up 001 OE 

Hs.32125 


131843 


AA1 95893 


Up 4 Q^neo 

Hs.1 84062 


A 1 A DCA 

131860 


U02082 


Hs.334 


131884 


H90124 


Hs.3463 


131903 


A A >I04 "700 

AA481723 


Hs.3436 


131945 


M87339 


1 1„ oc4on 

Hs. 35120 


131958 


a Annonno 

AA093998 


Up occe 

Hs.3566 


131964 


W42508 


Up ocno 

Hs.3593 


132001 


J 00277 


Up 07nno 
HS.J/UU3 


132040 


A A A ACQ 

AA1 46843 


Up 4 70QQj4 

HS.l Z2894 


^ inner 

132065 


nooooc 


Up 04 4 EQyf 

HS.zllby4 


A 004 AO 

132109 


a a cnnon-i 


Up yinnoQ 
HS.4UU90 


132112 


A A A Cf\CGA 

AA1 50661 


Up ,404C/I 

HS.4U154 


1Q91 9Q 
luZ I ZO 


AAAA719Q 
MAHH/ I/O 




132162 


H89551 


Hs.41241 


132180 


AA405569


Hs.418 


132309 


AA460917 


Hs.2780 


132371 


AA235448 


Hs.46677 


132618 


AA253330 


Hs.5344 


132736 


U68019 


Hs.211578 



apoptosis inhibitor 4 (survivin) 
ESTs 

ESTs; Weakly similar to Similar to phyto 
ESTs; Moderately similar to SODIUM- AND 
ESTs 
ESTs 

cytochrome P450 family member predicted

gap junction protein; beta 2; 26kD (conn 

EST 

ESTs 

ESTs 

ESTs; Weakly similar to MRJ [H.sapiens] 

ESTs 

ESTs 

ESTs 

ESTs 

activated RNA polymerase II transcriptio 
distal-less homeo box 5 

Homo sapiens mRNA; cDNA DKFZp586L141 (fr

ATP synthase; H+ transporting; mitochond 

5T4 oncofetal trophoblast glycoprotein 

Homo sapiens mRNA; cDNA DKFZp564B1264 (f 

syndecan 1 

KIAA1112 protein

desmoplakin (DPI; DPII) 

ESTs 

ESTs 

poly(A)-binding protein; cytoplasmic 1
ESTs; Highly similar to INTEGRIN BETA-8 
heterogeneous nuclear ribonucleoprotein 
ESTs; Moderately similar to recombinatio 
transcription factor 3 (E2A immunoglobul 
adenylate kinase 3 
retinol-binding protein 1; cellular 
EST 
ESTs 

Homo sapiens clone 24703 beta-tubulin mR 
ESTs 

small nuclear ribonucleoprotein polypept 
ribosomal protein S11 
thrombospondin 2 

ESTs; Moderately similar to HN1 [M.muscu 
KDEL (Lys-Asp-Glu-Leu) endoplasmic retic
ESTs 

ESTs; Moderately similar to SIGNAL RECOG 

GDP dissociation inhibitor 2 

UDP-Gal:betaGlcNAc beta 1;4- galactosylt 

collagen; type V; alpha 1 

nuclear factor (erythroid-derived 2)-lik

neurofilament; light polypeptide (68kD) 

Ubiquitin carrier protein 

UDP glycosyltransferase 1 

ESTs; Weakly similar to NADH-CYTOCHROME

CCAAT/enhancer binding protein (C/EBP); 

gamma-aminobutyric acid (GABA) A recepto 

ESTs; Highly similar to dipeptidyl pepti 

ESTs; Highly similar to HSPC038 protein 

ESTs 

keratin 19 

cyclin B1 

small inducible cytokine subfamily B (Cy
glypican 1

heat shock 70kD protein 9B (mortalin-2)
transcription factor-like 5 (basic helix 
KIAA0124 protein
ESTs 

ESTs; Moderately similar to putative Rab 

Oncogene TIM 

ribosomal protein S23 

deleted in oral cancer (mouse; homolog) 

replication factor C (activator 1) 4 (37 

ESTs; Highly similar to phosphorylation 

ESTs 

v-Ha-ras Harvey rat sarcoma viral oncoge 
BH3 interacting domain death agonist 
proteasome (prosome; macropain) 26S subu 
ESTs 

jumonji (mouse) homolog 

ESTs 

ESTs 

fibroblast activation protein; alpha; se 

jun D proto-oncogene 

ESTs 

adaptor-related protein complex 1; gamma 
MAD (mothers against decapentaplegic; Dr 



n 7 a 


1.64 


I.U3 


i.yo 


u.yu 


1 Q 

I.O 


u.y i 


1 RQ 

J.OO 


n qi 
u.y i 


1 R.Q 
i.oy 


U.HO 


n R5 
u.oo 


1 07 


1 RA 

J. OH 




1 4 


i.tto 


9 QQ 


4 
1 


1 


u.oo 


1 

i.oy 




2.93 


n rr 
u.oo 


1 Q 

I.O 


4 
I 


1 Q^ 
i.yo 


4 
1 


4 


4 
1 


1 1R 
1. 10 


n 7 a 

U./*t 


1 19 

1. IZ 


U.O/ 


1 1 

i . i 




1 7 
I./ 


n qo 


1 RQ 

i.oy 


l.bb 


R 7fi 
O. /o 




9 9fi 
Z.ZO 




9 9R 
Z.ZO 


1. 00 


1 RQ 
1.00 


i.yo 


Q RR 
O.00 


1 91 


1 RR 
I.00 


4 
1 


4 
1 


1.0 


2.16 


9 

A. OO 


9 fl 
z.o 


1 57 
1.0/ 


9 19 
z, tz 


1 94 


2.09 


1 nn 

I.UO 


1 7ft 
l./O 


1 9Q 
I.ZO 


Q 4R 


n R7 

U.O/ 


9 49 


1 99 


1 Q 
i.y 


I. I 


1 7Q 

l./O 


n Q9 
u.yz 


1 17 
I.I/ 


1 1A 


1 Q4 
1.84 


n o 
u.y 


1 ^A 
I, OH 


9 


3.19 


1 OA 
I.Uf 


3.2 


U.aO 


1 R1 

I.O 1 


1 9R 


9 RQ 
Z.00 


n Q7 
u.y/ 


1.63 


1 no 
i.uy 


1 7Q 
i./y 


n 7 A 


1 RR 
I.00 


1 At 


4 1Q 

h. iy 


1 17 
1. > / 


1.98 


1 9fi 
I.ZO 


1 7Q 
i./y 


4 
I 


i 


1 f!7 
l.U/ 


1 RR 
I.00 


4 
I 


A fl 
H.O 


u.yo 


1 DR 
I.UO 


4 

I 


1 9Q 
I.ZO 


1 1 
1. 1 


1 ft 
1 .0 


1 9ft 
I.ZO 


1 QR 
i.yo 


1.40 


9 flfi 

z.uo 


n ftp 
u.oo 


Q QR 
0.00 


1 1Q 

i. iy 


9 77 
z. // 


u.oo 


Q Rd. 
O.OH 


n RR 
u.oo 


9 QR 


u.yy 


1 R4 

I. OH 


i 


1 1R 

1. IO 


4 
1 


1 QR 

i.yo 


1 R5 
1. 00 


9 QQ 
z.oy 


4 
I 


1 QQ 
I .oo 


n ft*4 
u.oo 


1 fi^ 

I.OO 


1 OR 
I.UO 


2.2 


1 OQ 
I.ZO 


1 OA 


n qi 
u.y 1 


1 1R 

1. IO 


4 

i 


9 R 
Z.0 


n R7 

U.O/ 


1 QR 
I.OO 


4 

I 


1 9R 

I.ZO 


1 19 


1 AQ 

I. HO 


4 

1 


1 5R 
I.OO 


n rq 
u.oy 


1 97 


A 
1 


1 nR 

I.UO 


n qq 
u.yy 


1 44 

I.HH 


1.06 


2.46 


108 


Z46 


1.02 


4.56 


1.16 


1.8 


0.8 


1.26 


0.5 


1.49 


1.21 


1.81 






132771 


A A AOOA OO 

AA4oo4d2 


nS.ob4U/ 


132833 


U78525 


Hs.57783 


132922 


T23641 


HS.bObb 


132959 


AA028103


Hs.61472 


132994 


A A Cf\Ci OO 

AAoUbl oo 


Hs.7594 


A >JOAf\C 

133005 


rirti IAA 

C21400 


HS.lUoo<£9 


133065 


X62535 


MS.1 72690 


133083 


N70633 


Hs.6456 


133086 


L17131 


Hs.1 39800 


133134 


189703 


HS.65o4o 


133195 


AA350744 


Ur% to*. Ana 
HS.1o14Uy 


133313 


AA249427 


Hs.70704 


133331 


T62039 


u<» -i COC7C 
nS.IObbfb 


1334oo 


Dloo/U 


Ulr. 7O7O0 


133445 


T99o0o 


U» 707Q7 


1do4oo 


XoZ4zb 


liS./4U/U 


133492 


L40397 


HS.f41oY 


133504 


W90U/U 


nS. /4olb 


133517 


X52947 


nS./44/1 


133540 


D78151 


Lin -7AG4.0 

HS./4bi9 


133594 


L07758 


HS.1 fZoov 


133627 


U09587 


Hs.75280 


133671 


T25747 


Hs.75471 


133859 


U86782 


Hs. 178761 


133865 


F09315 


Hs.l70z9U 


133913 


I ft in A ~7A\ o 

W84712 


Hs.7753 


133963 


L34587 


Hs. 184693 


133982 


U47621 


Hs. 207251 


134100 


L07540 


M. 4 7-1 A7C 

HS.1 nOYo 


134110


i )a mem 
U41UbU 


Hs.r yioo 


134158 


U15174 


rfS.7y4zo 


134161 


U97188 


HS.7944U 


134193 


F09570 


nS./9oU 


134367 


X54199 


Hs.82285 


134402 


U25165 


II. Q17-I O 

Hs.82712 


134457 


D86963 


HS. 174044 


134469 


X17567 


Hs.83753 


134498 


ijco<i OA 

M63180 


Hs. 84131 


134501 


W84870 


Hs.211obo 


134507 


M63488 


t_i OVlO't o 

Hs. 84318 


134548 


U41515 


Hs.85215 


134599 


X99226 


Hs.86297 


134692 


R73567 


Hs.8850


134693 


ft.1^flfl&4 

N70361 


Hs.8854 


134806 


Z49099 


Hs.89718 


J 040V1 


7*3 vf 0.7.4 


ns. lyoooz 


134864 


Y08999 


Hs.90370 


134914 


U29615 


Hs.91093 


134953 


L10678 


Hs.91747 


134993 


AA282343


Hs.9242 


135051 


C15324 


Hs.93668 


135158 


U51711 





phosphoserine phosphatase 1 

eukaryotic translation initiation factor 0.91 

KIAA1112 protein 1.16

ESTs; Weakly similar to unknown [S.cerev 1.02

solute carrier family 2 (facilitated glu 0.72

KIAA0970 protein 0.88

diacylglycerol kinase; alpha (80kD) 0.93

chaperonin containing TCP1; subunit 2 (b 1.14

high-mobility group (nonhistone chromoso 0.97

RNA binding motif protein 8 1.1

KIAA1007 protein 2.29

ESTs 107

ribosomal protein L14 0.85

APEX nuclease (multifunctional DNA repai 0.91

guanine nucleotide binding protein (G pr 0.94

keratin 13 0.85

transmembrane trafficking protein 1.1

desmoplakin (DPI; DPII) 0.7

gap junction protein; alpha 1; 43kD (con 0.95

proteasome (prosome; macropain) 26S subu 0.91

nuclear phosphoprotein similar to S. cer 0.84

glycyl-tRNA synthetase 1.09

zinc finger protein 146 1.02

26S proteasome-associated pad1 homolog 1.11

discs; large (Drosophila) homolog 5 1.84

calumenin 1.15

transcription elongation factor B (SIII) 1.3

nucleolar autoantigen (55kD) similar to 1.3

replication factor C (activator 1) 5 (36 0.72

LIV-1 protein; estrogen regulated 1.04

BCL2/adenovirus E1B 19kD-interacting pro 1

IGF-II mRNA-binding protein 3 0.82

ESTs 0.98

phosphoribosylglycinamide formyltransfer 1

fragile X mental retardation; autosomal 1.26

dishevelled 3 (homologous to Drosophila 1

small nuclear ribonucleoprotein polypept 0.94

threonyl-tRNA synthetase 1.2

eukaryotic translation initiation factor 0.84

replication protein A1 (70kD) 1.7

Deleted in split-hand/split-foot 1 regio 1.46

Fanconi anemia; complementation group A 1.36 

a disintegrin and metalloproteinase doma 0.77 

ESTs 1.09 

spermine synthase 0.98 

plakophilin 1 (ectodermal dysplasia/skin 0.99 

actin related protein 2/3 complex; subun 0.95 

chitinase 1 (chitotriosidase) 1.16 

profilin 2 0.95 

purine-rich element binding protein B 0.98 

ESTs 1.35 

Human desmocollin-2 mRNA; 3' UTR 0.86 



1.3 

1.43 

1.53 

1.88 

2.97 

1.34 

1.23 

1.76 

1.43 

1.8 

2.69 

1.68 

1.18 

1.45 

1.68 

1.14 

1.69 

6.21 

1.3 

1.25 

1.29 

1.99 

1.5 

3.33 

6.7 

1.86 

1.91 

1.99 

1.65 

162 

1.55 

1.95 

1.48 

2.8 

2 

1.47 

1.57 

2.64 

1.36 

2.93 

2.73 

2.22 

1.64 

1.82 

1.35 

1.4 

1.42 

1.29 

1.76 

1.73 

2.11 

1.16 



Table 1B shows the accession numbers for those Pkeys in Table 1A lacking UnigeneIDs. For each probeset we have listed the gene cluster number from which the 
oligonucleotides were designed. Gene clusters were compiled using sequences derived from Genbank ESTs and mRNAs. These sequences were clustered based on sequence 
similarity using Clustering and Alignment Tools (DoubleTwist, Oakland, California). The Genbank accession numbers for sequences comprising each cluster are listed in the 
"Accession" column. 

Pkey: Unique Eos probeset identifier number 
CAT number: Gene cluster number 
Accession: Genbank accession numbers 
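
The three-column layout described above (Pkey, CAT number, accession list) can be read mechanically. The following is only a minimal sketch: the function name `parse_cluster_row`, the dictionary keys, and the simplified accession pattern are assumptions made for illustration and are not part of the tables themselves.

```python
import re

# Assumed, simplified accession shapes: one letter + 5 digits (e.g. L05424),
# two letters + 6 digits (e.g. AW351820), or a RefSeq-style NM_ + 6 digits.
ACCESSION_SHAPE = re.compile(r"(?:[A-Z]\d{5}|[A-Z]{2}\d{6}|NM_\d{6})")


def parse_cluster_row(line):
    """Split one 'Pkey CAT-number accession...' row into its three parts."""
    fields = line.split()
    if len(fields) < 3:
        raise ValueError("expected a Pkey, a CAT number and at least one accession")
    accessions = fields[2:]
    return {
        "pkey": int(fields[0]),
        "cat_number": fields[1],
        "accessions": accessions,
        # Tokens that do not look like an accession are reported, not dropped.
        "suspect": [a for a in accessions if not ACCESSION_SHAPE.fullmatch(a)],
    }


row = parse_cluster_row("100668 26401_3 L05424 X56794 S66400 X55150 W60071 AW351820")
print(row["pkey"], row["cat_number"], len(row["accessions"]), row["suspect"])
```

Keeping malformed tokens in a separate `suspect` list, instead of discarding them, leaves the decision about each one to the reader of the table.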



Pkey CAT number Accessions 

100661 23182_1 BE623001 L05096 AA383604 AW966416 N53295 AA460213 AW571519 AA603655 

100667 26401_3 L05424 X56794 S66400 X55150 W60071 AW351820 X55938 M83326 BE005289 BE070059 M83324 BE005248 BE069717 BE181648 BE069700 
AW606203 BE069721 AW382138 AW803776 BE463954 BE005334 BE005274 T27386 AA932714 AA972695 AW377728 AI632506 T29066 
AI783934 AW377727 BE163715 AL047291 AA279047 AA523003 BE008048 BE440141 W23614 BE090519 BE092193 N29181 N20358 N44153 
BE546944 T69231 AW377441 AA907406 H50799 AW051416 AI420712 BE620922 AI279161 AA992549 W47198 BE005241 AI342696 H50700 
AI969974 AI863855 AA374490 AW130675 AI950633 AA146687 H99482 X55150 BE005414 BE005339 N28294 AI673068 AI887890 AW804171 
AI675961 AW804172 AA778841 AL048050 AI127757 AI095568 AW204965 AW468978 W31898 AI052595 AI278771 BE464018 AI081503 AI824196 
AA513211 AA411062 AW084376 N48752 AA703209 N35580 AW059918 AA054563 AI280942 T27619 BE621435 N66010 AW589527 AI160414 
AA283090 AA962536 H82726 W52115 W45432 W60433 AA577548 AA146714 BE150994 AA054615 AW796025 AW382768 BE565671 C00444 
AA054555 

100668 26401_3 L05424 X56794 S66400 X55150 W60071 AW351820 X55938 M83326 BE005289 BE070059 M83324 BE005248 BE069717 BE181648 BE069700 
AW606203 BE069721 AW382138 AW803776 BE463954 BE005334 BE005274 T27386 AA932714 AA972695 AW377728 AI632506 T29066 
AI783934 AW377727 BE163715 AL047291 AA279047 AA523003 BE008048 BE440141 W23614 BE090519 BE092193 N29181 N20358 N44153 
BE546944 T69231 AW377441 AA907406 H50799 AW051416 AI420712 BE620922 AI279161 AA992549 W47198 BE005241 AI342696 H50700 
AI969974 AI863855 AA374490 AW130675 AI950633 AA146687 H99482 X55150 BE005414 BE005339 N28294 AI673068 AI887890 AW804171 
AI675961 AW804172 AA776841 AL048050 AI127757 AI095568 AW204965 AW468978 W31898 AI052595 AI278771 BE464018 AI081503 AI824196 
AA513211 AA411062 AW084376 N48752 AA703209 N35580 AW059918 AA054563 AI280942 T27619 BE621435 N66010 AW589527 AI160414 
AA283090 AA962536 H82726 W52115 W45432 W60433 AA577548 AA146714 BE150994 AA054615 AW796025 AW382768 BE565671 C00444 
AA054555 

101332 25130_1 J04088 NM_001067 AF071747 AJ011741 N85424 AL042407 AA218572 BE296748 BE083981 AL040877 AW499918 AW675045 H17813 
BE081283 AA670403 AW504327 BE094229 AA104024 AI471482 AI970337 AA737616 AI827444 AW003286 AI742333 AI344044 AI765634 
Table 2A shows 504 genes down-regulated in lung tumors relative to normal lung and chronically diseased lung. Chronically diseased lung samples represent chronic non- 
malignant lung diseases such as fibrosis, emphysema, and bronchitis. These genes were selected from 59680 probesets on the Eos/Affymetrix Hu03 Genechip array. Gene 
expression data for each probeset obtained from this analysis was expressed as average intensity (AI), a normalized value reflecting the relative level of mRNA expression. 
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34.80 
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38.80 
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11.00 



3.19 



121835 
121841 
121885 
121888 
121938 
121950 
122030 
122054 
122211 
122233 
122247 
122253 
122266 
122285 
122409 
122485 
122697 
122772 
122831 
122913 
123049 
123076 
123136 
123309 
123455 
123691 
123756 
123802 
123837 
123844 
123936 
123987 
124013 
124160 
124205 
124226 
124246 
124348 
124358 
124409 
124442 
124468 
124479 
124519 
124711 
124866 
124874 
125097 
125179 
125200 
125299 
125400 
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126303 
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126507 
126773 
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127462 
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127572 
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128073 
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128212 
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128364 
128426 
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128773 
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interleukin 7 receptor 2.29 
ESTs 

Human cytochrome P450-IIB (hliB3) mRNA; 
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protein tyrosine phosphatase; non-recept 
ESTs 

angiotensin receptor 1B 

carbonic anhydrase IV 

angiopoietin 1 receptor; TEK tyrosine ki 

thyroid transcription factor 1 

ficolin (collagen/fibrinogen domain-cont 

ESTs 

ESTs 

RNA binding motif protein 6 
ESTs 
TABLE 2B shows the accession numbers for those primekeys lacking UnigeneIDs for Table 2A. For each probeset we have listed the gene cluster number from which the 
oligonucleotides were designed. Gene clusters were compiled using sequences derived from Genbank ESTs and mRNAs. These sequences were clustered based on sequence 
similarity using Clustering and Alignment Tools (DoubleTwist, Oakland, California). The Genbank accession numbers for sequences comprising each cluster are listed in the 
"Accession" column. 

Pkey: Unique Eos probeset identifier number 
CAT number: Gene cluster number 
Accession: Genbank accession numbers 



Pkey CAT number Accessions 

108447 43452_-7 AA079126 
108550 120073_1 AA084867 AA084996 
108655 127522_1 AA099960 AA113013 
102397 44371_-1 U41898 
126303 1525933_1 D78841 D78880 
125810 1554054_1 H00083 R81062 
103627 2615_2 Z48513 Z48512 
121366 280401_1 AI743515 AA405617 AW276706 
114609 116777_1 AA079505 AA079537 
115272 172113_1 AW015947 AA211890 AA279425 
108338 112186_1 AA070773 AA070774 
108434 114012_1 AA078899 AA078782 AA075788 
123802 genbank_AA620448 AA620448 
102310 NOT_FOUND_entrez_U33839 U33839 
102636 entrez_U67092 U67092 
104776 genbank_AA026349 AA026349 
120504 genbank_AA256837 AA256837 
113502 genbank_T89130 T89130 
108499 genbank_AA083103 AA083103 
101308 entrez_L41390 L41390 
108629 genbank_AA102425 AA102425 
103098 221_215 M86361 Z26593 X02850 D13070 AE000659 M17649 M87869 M87871 X61077 M16286 AF018169 X61079 S59351 X60142 AF043169 
103241 entrez_X76223 X76223 
103508 entrez_Y10141 Y10141 
103575 entrez_Z26256 Z26256 
119514 NOT_FOUND_entrez_W37937 W37937 
121082 genbank_AA398722 AA398722 
128634 AA464918_at AA464918 
105817 genbank_AA397825 AA397825 
121518 genbank_AA412155 AA412155 
114449 genbank_AA020736 AA020736 
114648 genbank_AA101056 AA101056 
121950 genbank_AA429515 AA429515 
107723 genbank_AA015967 AA015967 
Table 3A shows 452 genes up-regulated in chronically diseased lung relative to normal lung. Chronically diseased lung samples represent chronic non-malignant lung diseases 
such as fibrosis, emphysema, and bronchitis. These genes were selected from 59680 probesets on the Eos/Affymetrix Hu03 Genechip array. Gene expression data for each 
probeset obtained from this analysis was expressed as average intensity (AI), a normalized value reflecting the relative level of mRNA expression. 

Pkey: Unique Eos probeset identifier number 
ExAccn: Exemplar Accession number, Genbank accession number 
UnigeneID: Unigene number 
Unigene Title: Unigene gene title 
R1: 80th percentile of AI for chronically diseased lung samples divided by the 90th percentile of AI for normal lung samples. 
R2: 80th percentile of AI for chronically diseased lung samples divided by the 90th percentile of normal lung samples, squamous cell carcinomas and 
adenocarcinomas 
R3: 70th percentile of AI for chronically diseased lung samples minus the 15th percentile of AI for all normal lung, chronically diseased lung and tumor samples 
divided by the 90th percentile of normal lung samples, squamous cell carcinomas and adenocarcinomas minus the 15th percentile of AI for all normal lung, 
chronically diseased lung and tumor samples 
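
The three ratios defined above can be illustrated for a single probeset as follows. This is a minimal sketch under stated assumptions: the text does not say how percentiles are interpolated or whether normal and tumor samples are pooled before the 90th percentile is taken, so linear interpolation and pooling are assumed here, and the function names `percentile` and `table3a_ratios` are hypothetical.

```python
def percentile(values, p):
    """p-th percentile (0-100), linear interpolation between order statistics.

    The interpolation rule is an assumption; the source does not specify one.
    """
    xs = sorted(values)
    if not xs:
        raise ValueError("no values supplied")
    k = (len(xs) - 1) * p / 100.0
    lo = int(k)
    hi = min(lo + 1, len(xs) - 1)
    return xs[lo] + (xs[hi] - xs[lo]) * (k - lo)


def table3a_ratios(diseased, normal, tumor):
    """Return (R1, R2, R3) for one probeset from lists of AI values."""
    normal_and_tumor = normal + tumor          # assumed to be pooled
    all_samples = normal + diseased + tumor
    background = percentile(all_samples, 15)   # 15th percentile of all samples

    r1 = percentile(diseased, 80) / percentile(normal, 90)
    r2 = percentile(diseased, 80) / percentile(normal_and_tumor, 90)
    r3 = (percentile(diseased, 70) - background) / (
        percentile(normal_and_tumor, 90) - background
    )
    return r1, r2, r3


# Illustrative AI values only, not data from the tables.
print(table3a_ratios([100.0] * 5, [10.0] * 5, [20.0] * 5))
```

R3 differs from R2 in that a common background level is subtracted from numerator and denominator before dividing, so a probeset whose signal sits close to that background in every group no longer yields a large ratio from small absolute differences.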
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11.40 
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35.80 
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23.20 
15.20 
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10.71 
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27.20 
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12.00 
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17.20 
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6.00 
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3.60 

11.40 
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10.20 
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2.05 



2.40 
1.78 
1.76 
2.19 



1.94 
1.75 
2.47 



1.92 



1.87 
1.93 



1.91 



1.80 
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MS.aby 
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BE245294 
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MS.l / /48b 


100716 
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egf-like module containing, mucin-like, 
cadherin 5, type 2, VE-cadherin (vascula 
LIM domain only 2 (rhombotin-like 1) 
protein tyrosine phosphatase, receptor t 
progastricsin (pepsinogen C) 

CUG triplet repeat, RNA-binding protein 11.00 

CDP-diacylglycerol synthase (phosphatida 25.40 

signal transducing adaptor molecule (SH3 14.00 

amine oxidase, copper containing 3 (vasc 

protein kinase C-like 2 10.86 

guanine nucleotide binding protein 11 

chemokine (C-X3-C) receptor 1 

steroidogenic acute regulatory protein 16.40 

spleen tyrosine kinase 15.40 

mannose receptor, C type 1 

myeloid cell nuclear differentiation ant 

S100 calcium-binding protein A4 (calcium 

tachykinin, precursor 1 (substance K, su 18.80 

complement component 5 receptor 1 (C5a l 

gb:Human alpha satellite and satellite 3 504.80 

coagulation factor VIII, procoagulant co 

hydroxyprostaglandin dehydrogenase 15-(N 

calcitonin receptor-like 

FBJ murine osteosarcoma viral oncogene h 

enhancer of filamentation 1 (cas-like do 

microfibrillar-associated protein 4 

gb:Human dystrophin (dp140) mRNA, 5' end 19.00 

G protein-coupled receptor kinase 5 

transforming growth factor, beta recepto 

solute carrier family 6 (neurotransmitte 

Charcot-Leyden crystal protein 19.38 

fatty acid binding protein 4, adipocyte 

S164 protein 15.40 

amyloid beta (A4) precursor protein (pro 11.20 

HIR (histone cell cycle regulation defec 14.80 

gb:Human nonmuscle myosin heavy chain-B 33.00 

KIAA0237 gene product 16.20 

src homology three (SH3) and cysteine ri 

Down syndrome critical region gene 1-lik 

growth differentiation factor 10 
macrophage scavenger receptor 1 
hyaluronoglucosaminidase 2 
myocilin, trabecular meshwork inducible 



1.76 
2.15 



7.40 



31.00 



7.52 



1.78 
2.22 

1.75 
2.24 

2.01 

1.91 



4.00 
4.24 
6.20 
21.20 



5.40 



1.79 



TABLE 3B shows the accession numbers for those primekeys lacking UnigeneIDs for Table 3A. For each probeset we have listed the gene cluster number from which the 
oligonucleotides were designed. Gene clusters were compiled using sequences derived from Genbank ESTs and mRNAs. These sequences were clustered based on sequence 
similarity using Clustering and Alignment Tools (DoubleTwist, Oakland, California). The Genbank accession numbers for sequences comprising each cluster are listed in the 
"Accession" column. 



Pkey: Unique Eos probeset identifier number 
CAT number: Gene cluster number 
Accession: Genbank accession numbers 



Pkey CAT number Accessions 

123619 371681_1 AA602964 AA609200 
126433 127143_1 AA325606 AA099517 N89423 
125831 1522905_1 H04043 D60988 D60337 
126816 122973_1 AA248234 AA090985 
126852 136135_1 AA399961 AA128347 
121059 273450_1 AA393283 AA398628 
120637 200885_1 AA811804 AA809404 AA286907 AW977624 
122011 7617_-2 AA431082 
120934 177521_1 AA226198 AA226513 AA383773 
123802 genbank_AA620448 AA620448 
116814 genbank_H50834 H50834 
118329 genbank_N63520 N63520 
104404 H58762_at H58762 
104776 genbank_AA026349 AA026349 
113502 genbank_T89130 T89130 
101262 entrez_L35854 L35854 
108573 genbank_AA086005 AA086005 
101447 entrez_M21305 M21305 
124357 genbank_N22401 N22401 
108781 genbank_AA128654 AA128654 
112794 genbank_R97018 R97018 
100351 entrez_D64158 D64158 
100555 tigr_HT2245 M69181 M81105 U51039 
Table 4A shows 202 genes up-regulated in samples from patients treated with chemotherapy or radiotherapy. These genes were selected from 59680 probesets on the 
Eos/Affymetrix Hu03 Genechip array. Gene expression data for each probeset obtained from this analysis was expressed as average intensity (AI), a normalized value reflecting 
the relative level of mRNA expression. 

Pkey: Unique Eos probeset identifier number 
ExAccn: Exemplar Accession number, Genbank accession number 
UnigeneID: Unigene number 
Unigene Title: Unigene gene title 
R1: average of AI for samples from patients treated with chemotherapy or radiotherapy divided by the average of AI for normal lung samples. 
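
As a small illustration of the R1 definition above, the ratio of group averages for one probeset could be computed as sketched below. The function name `treated_to_normal_ratio` and the sample values are hypothetical, and an arithmetic mean is assumed because the text says only "average".

```python
from statistics import mean


def treated_to_normal_ratio(treated_ai, normal_ai):
    """Mean AI of treated-patient samples divided by mean AI of normal lung samples."""
    normal_mean = mean(normal_ai)
    if normal_mean == 0:
        raise ZeroDivisionError("mean AI of the normal lung samples is zero")
    return mean(treated_ai) / normal_mean


# Illustrative AI values only, not data from the table.
print(treated_to_normal_ratio([40.0, 60.0], [2.0, 3.0]))
```

Because a plain mean is used, one sample with very high AI can raise this ratio substantially, unlike the percentile-based ratios defined for Table 3A.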
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113494 T91451 Hs.86538 ESTs 22.80 

113560 T91015 Hs.268626 ESTs 22.80 

113849 M457211 Hs.8858 bromodomain adjacent to zinc finger doma 51.80 

1 1 3950 AI267652 Hs.30504 Homo sapiens mRNA; cDNA DKF2p434E082 (fr 28.20 

5 114339 AA782845 Hs.22790 ESTs 20.20 

114365 H42169 Hs.18653 hypothetical protein FL) 14627 21.00 

1 14455 H37908 Hs.271616 ESTs, Weakly similar to ALU8_HUMAN ALU S 25.80 

114518 AW163267 Hs.1 06469 suppressor of var! (S.cerevisiae) 3-like 23.60 

114824 M960961 Hs.305953 zinc finger protein 83 (HPF1) 27.20 

10 114837 BE244930 Hs.166895 ESTs 30,20 

114974 AW966931 Hs.179662 nucleosome assembly protein 1 -like 1 20.80 

115075 AA814043 Hs.88045 ESTs 30.60 

115084 BE383668 Hs.42484 hypothetical protein FLJ 1061 8 28.86 

115291 BE545072 Hs.122579 hypothetical protein FLJ 10461 38.00 

15 115313 M808001 Hs.184411 albumin 22.60 

115697 D31382 Hs.63325 transmembrane protease, serine 4 173.60 

1 1 5909 AW872527 Hs.59761 ESTs, Weakly similar to DAP1.HUMAN DEATH 27.77 

116090 AI591147 Hs.61232 ESTs 20.80 

116107 AL133916 Hs.172572 hypothetical protein FLJ20093 164.20 

20 116399 M889120 Hs.110637 homeoboxA10 38.00 

117099 H93699 gb:yv16a11.s1 Soares fetal liver spleen 21.60 

117881 AF161470 Hs.260622 butyrate-induced transcript 1 49.40 

118091 AW005054 Hs.47883 ESTs, Weakly similar to KCC1 .HUMAN CALCI 22.40 

118138 AA374756 Hs.93560 Homo sapiens mRNA for KIAA1 771 protein, 22.00 

25 118720 N73515 gb:za49d07.s1 Soares fetal liver spleen 20.00 

118873 AI824009 Hs.44577 ESTs 19.40 

119126 R45175 Hs.117183 ESTs 111.20 

119717 AA918317 Hs.57987 B-cell CLUymphoma 11B (zinc finger pro 33.00 

119940 AL050097 Hs.272531 DKFZP586B031 9 protein 31.00 

3 0 1 20266 AI807264 Hs.205442 ESTs, Weakly similar to T34036 hypotheti 20.20 

120515 AA258356 gb:zr59c10.s1 Soares_NhHMPu_S1 Homosapi 25.00 

120859 AA826434 Hs.1619 achaete-scute complex (Drosophila) homol 95.40 

120983 AA398209 Hs.97587 EST 105.20 

121054 AW976570 Hs.97387 ESTs 38.80 

121369 AW450737 Hs.128791 CGI-09 protein 41.60

122335 AA443258 Hs.241551 chloride channel, calcium activated, fam 30.80

122612 AA974832 Hs.128708 ESTs 19.60 

123130 AA487200 gb:ab19f02.s1 Stratagene lung (937210) H 33.20 

123440 AI733692 Hs.112488 ESTs 23.17 

123596 AA421130 Hs.112640 EST 23.00

123619 AA602964 gb:no97c02.s1 NCI_CGAP_Pr2 Homo sapiens 28.80 

124006 AI147155 Hs.270016 ESTs 77.60 

124169 BE079334 Hs.271630 ESTs 22.20 

124281 AI333756 Hs.111801 arsenate resistance protein ARS2 42.20

124472 N52517 Hs.102670 EST 32.60

124617 AW628168 Hs.152684 ESTs 21.80 

124631 NM_014053 Hs.270594 FLVCR protein 30.40 

124839 R55784 Hs.140942 ESTs 21.20 

125186 AA610620 Hs.181244 major histocompatibility complex, class 42.80 

125321 T86652 Hs.178294 ESTs 27.00

125535 NM_013243 Hs.22215 secretogranin III 23.80

125646 AA628962 Hs.75209 protein kinase (cAMP-dependent, catalyti 23.20

125684 AW589427 Hs.158849 Homo sapiens cDNA: FLJ21663 fis, clone C 21.20

125724 AL360190 Hs.295978 Homo sapiens mRNA full length insert cDN 48.80

125847 AW161885 Hs.249034 ESTs 31.00

125934 AA193325 Hs.32646 hypothetical protein FLJ21901 21.20

126077 M78772 Hs.210836 ESTs 49.80 

126299 AW979155 Hs.298275 amino acid transporter 2 21.80 

126395 AI468004 Hs.278956 hypothetical protein FLJ12929 71.00

126433 AA325606 gb:EST28707 Cerebellum II Homo sapiens c 23.20

126509 R47400 Hs.23850 ESTs 23.80 

126538 AB030656 Hs.17377 coronin, actin-binding protein, 1C 23.10 

126666 AA648886 Hs.151999 ESTs 36.00 

126812 AB037860 Hs.173933 nuclear factor I/A 20.80

126872 AW450979 gb:UI-H-BI3-ata-a-12-0-UI.s1 NCI_CGAP_Su 46.29

127046 AA321948 Hs.293968 ESTs 22.80 

127431 AW771958 Hs.175437 ESTs, Moderately similar to PC4259 ferri 30.00 

127489 AA650250 Hs.272076 ESTs 20.80 

127521 AW297206 Hs.164018 ESTs 25.20 

127742 AW293496 Hs.180138 ESTs 28.00

127925 AA805151 Hs.3628 mitogen-activated protein kinase kinase 21.20

127930 AA809672 Hs.123304 ESTs 20.54 

127968 AA830201 Hs.124347 ESTs 28.20 

127987 AI022103 Hs.124511 ESTs 19.60 

128116 H07103 Hs.286014 Homo sapiens, clone IMAGE:3867243, mRNA 20.40

128609 NM_003616 Hs.102456 survival of motor neuron protein interac 34.40 

m777 AI878918 Hs.10526 cysteine and glycine-rich protein 2 53.80 

128949 AA009647 Hs.8850 a disintegrin and metalloproteinase doma 23.00 

129168 AI132988 Hs.109052 chromosome 14 open reading frame 2 37.60 

129404 AI267700 Hs.317584 ESTs 28.60

129527 AA769221 Hs.270847 delta-tubulin 40.80

129574 AA026815 Hs.11463 UMP-CMP kinase 31.20

129598 N30436 Hs.11556 Homo sapiens cDNA FLJ12566 fis, clone NT 29.60

129785 H19006 Hs.184780 ESTs 72.20

129970 AV655806 Hs.296198 chromosome 12 open reading frame 4 22.20
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130149 


AW067805 


1 l_ 4T)COC 

Hs.1 /2bbb 


methylenetetrahydrofolate dehydrogenase 


/y.ou 


130199 


Z48579 


Hs. 172028 


a disintegrin and metaltoproteinase doma 


97 an 


130441 


U63630 


ns.loobo/ 


protein kinase, DNA-activated, catalytic 


9R 

10.00 


130466 


W19744 


Hs. 180059 


Homo sapiens cuna rLJAibbo tis, clone w\ 


on 9H 


130482 


AW409701 


Hs.1578 


baculoviral IAP repeat-containing 5 (sur 


22 40 


130617 


M9Q516 


Hs.1674 


glutamine-fructose-6-phosphate transamin 


1Q fin 


130703 


R77776 


i i_ a oa no 
HS. 18103 


ESTs • 


iy,^u 


130732 


AWoaU4b/ 


Hs.bdyo4 


cadherin 13, H-cadherin (heart) 




4 0flQR7 

IJUob/ 


MM f\(\4{Y71 


Uo OHA01Q 


uur giycosyiudiisierdoc i idiniiy, puiypu 


110.00 


1o1U2o 


Aiovyibo 


Up- 9997 

r\S. ZZZI 


rPAAT/onhanrior hi'nriinn nrnforn /rVPRP^ 


25.20 


1 JTUou 




He 99R1 


unrurnuy rdiiiii □ ^bcuciuyidiiui \j 


40.60 


131284 


N!vl_00l42y 


Uo 9K979 


E1A binding protein p300 


24.60 


131775 


Ab014o4o 


Hs.oiy/i 


naaud^o proiein 


21.00 


4 14 ocn 


DCOQOG7C 
DCOOOb/D 


HS.O04 


r\no guanine nucieouue exuiidiiye iduiui \ 


33.40 


131945 


NM_00291b 


Hs.obizu 


repiication factor C (activator 1)4 (37 


60.80 


•toon/in 

132040 


nil nn44Q£ 

NM_00119b 


Llr» 91KGD0 

Hs.oioooy 


riorno sapiens cuinm. rLo/icoio tis, cioiic n 


20.40 


132084 


NM_Q022b7 


HS.obob 


karyopherin alpha 3 (importin alpha 4) 


9Q AO 


132389 


a a o4 nooo 
AA31 Ud9o 


HS.iyUU44 


Co IS 


32.40 


132437 


a A'jcmic 
AAlbilUb 


ns.4ooy 


cyclin L am a- 6a 


27.40 


132550 


AW9b925J 


HS.l /uiyo 


bone morphogenetic protein 7 (osteogenic 


I O.DU 


132617 


AF037335 


Hs.5338 


carbonic anhydrase Xil 


^1 ^fi 
O 1 .00 


•1 onCOO 

1 6Zb6Z 


AUO/byib 


Hs.ooao 


guanine monphosphate synthetase 


32.40 


132672 


W27721 


HS.b4by/ 


Cdc42 guanine exchange factor (GEF) 9 


23.40 




AAU<co4oU 


Ue 909ft 19 


PQTc Woaklu similar in T^^4fiR hunnthpti 
co I s, vvedKiy biniiidi iu i oohoo ny puu iuu 


61.20 


132771 


Y 10275 


HS.bb4U/ 


phosphoserine phosphatase 


22.33 


4 iimn 
looU/U 


J IOOC/1Q 

Uy2b4s 


Up P.A14 4 

nS.b4ol l 


n rlioinfonrin onrl m^f?allnr\mforricico Hnma 

a oisiniegnn diiu mcidiiupiuiciiidbB uunid 


23.50 


looloo 


A cmnzQi 


MS.bb i/U 


nor\[vi-D proiciu 


30.00 


133181 


a9vdo2 


Up CR7AA 

HS.ob/44 


IWISI {uiosopnua/ nuiuoioy [dui vL&piidujs 


23.80 


4 O09QO 


A AvMQfHE 

AA44yU10 


MS.^obl4b 


CDQ7 /nmnmccnr nf DMA nrihimoraco R UO 
OKO/ ^SUppicobOr 07 r\tNrt (JUiyniUldbc D, yc 


51.60 


133350 


AI499220 


Hs.71573 


hypothetical protein FLJ 10074 


w nn 


133592 


AV6520OD 


Hs.75113 


general transcription factor ll!A 


ft9 nn 


133658 


A A O A t\A AC 

AA319146 


Hs.75426 


secretogranin II (chromogranin C) 




133865 


A nA4 A A r- rf 

AB011155 


1 1_ a "7rtonr» 

Hs, 170290 


discs, large (Drosophila) homolog 5 


oy.oo 


134032 


NM_005025 


|_J_ 7QCOQ 

HS.78589 


serine (or cysteine) proteinase inhibito 


W 90 


134125 


til i e\A A-I04 

NM_0 14781 


I i_ cnA r )4 

HS.50421 


i\iAAU2Uo gene proauci 


^1 fin 

0 I.OU 


134158 


U15174 


1 1— -jriAno 

Hs.79428 


BCL2/adenovirus E1B 19kD-interacting pro 


fin 
ou.ou 


134321 


BE538082 


Hs.8172 


ho I s, Moderately similar to A4binu A-iin 


iO.*tU 


1 34367 


AA^Q44q 


Hs 82285 


phosphoribosylglycinamide formyltransfer 


49.20 


134570 U66615 Hs.172280 SWI/SNF related, matrix associated, acti 20.20

134753 NM_006482 Hs.173135 dual-specificity tyrosine-(Y)-phosphoryl 20.80

135002 AA448542 Hs.251677 G antigen 7B 37.60

135029 H58818 Hs.187579 hydroxysteroid (17-beta) dehydrogenase 53.40

135047 AL134197 Hs.93597 cyclin-dependent kinase 5, regulatory su 31.60

135345 X53655 Hs.99171 neurotrophin 3 28.80



TABLE 4B shows the accession numbers for those primekeys lacking UnigeneID's for Table 4A. For each probeset we have listed the gene cluster number from which the
oligonucleotides were designed. Gene clusters were compiled using sequences derived from Genbank ESTs and mRNAs. These sequences were clustered based on sequence
similarity using Clustering and Alignment Tools (DoubleTwist, Oakland, California). The Genbank accession numbers for sequences comprising each cluster are listed in the
"Accession" column.

Pkey: Unique Eos probeset identifier number
CAT number: Gene cluster number
Accession: Genbank accession numbers



Pkey CAT number Accessions

123619 371681_1 AA602964 AA609200

126433 127143_1 AA325606 AA099517 N89423

126872 142696_1 AW450979 AA136653 AA136656 AW419381 AA984358 AA492073 BE168945 AA809054 AW238038 BE011212 BE011359 BE011367 BE011368 BE011362 BE011215 BE011365 BE011363

106851 322947_1 AI458623 AA639708 AA485409 R22065 AA485570

118720 genbank_N73515 N73515

120515 genbank_AA258356 AA258356

117099 321871_1 H93699 H97976 H80036

101447 entrez_M21305 M21305

123130 genbank_AA487200 AA487200
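
For readers who want to work with the Pkey / CAT number / Accessions layout programmatically, the following is a minimal Python sketch. It assumes each row has been cleaned into whitespace-separated fields (Pkey, then CAT number, then one or more accessions); the function name `parse_cluster_rows` and the two sample rows are illustrative only and are not part of the original disclosure.

```python
from typing import Dict, Iterable, List


def parse_cluster_rows(lines: Iterable[str]) -> Dict[str, Dict[str, object]]:
    """Map each Pkey to its CAT number and list of Genbank accessions.

    Assumes one row per line: Pkey, CAT number, then accessions,
    all separated by whitespace. Rows with fewer than three fields
    are skipped rather than guessed at.
    """
    table: Dict[str, Dict[str, object]] = {}
    for line in lines:
        fields: List[str] = line.split()
        if len(fields) < 3:
            continue
        pkey, cat, *accessions = fields
        table[pkey] = {"cat": cat, "accessions": accessions}
    return table


# Hypothetical cleaned rows in the Table 4B layout.
rows = [
    "123619 371681_1 AA602964 AA609200",
    "118720 genbank_N73515 N73515",
]
clusters = parse_cluster_rows(rows)
```

A lookup such as `clusters["123619"]["accessions"]` then returns the accessions that make up the cluster for that probeset.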
Table 5A shows 680 genes up-regulated in squamous cell carcinoma or adenocarcinoma lung tumors relative to normal lung and chronically diseased lung. These genes were
selected from 59680 probesets on the Eos/Affymetrix Hu03 Genechip array. Gene expression data for each probeset obtained from this analysis was expressed as average
intensity (AI), a normalized value reflecting the relative level of mRNA expression.



Pkey: Unique Eos probeset identifier number

ExAccn: Exemplar Accession number, Genbank accession number

UnigeneID: Unigene number

Unigene Title: Unigene gene title

R1: 70th percentile of AI for squamous cell carcinoma and adenocarcinoma lung tumor samples divided by the 90th percentile of AI for normal and chronically diseased lung samples.

R2: 80th percentile of AI for adenocarcinoma lung tumor samples divided by the 90th percentile of AI for normal and chronically diseased lung samples.

R3: 80th percentile of AI for squamous cell carcinoma lung tumor samples divided by the 90th percentile of AI for normal and chronically diseased lung samples.

R4: 80th percentile of AI for adenocarcinoma lung tumor samples divided by the 80th percentile of AI for squamous cell carcinoma lung tumor samples.

R5: 70th percentile of AI for squamous cell carcinoma and adenocarcinoma lung tumor samples minus the 15th percentile of AI for all normal lung, chronically diseased lung and tumor samples, divided by the 90th percentile of AI for normal and chronically diseased lung samples minus the 15th percentile of AI for all normal lung, chronically diseased lung and tumor samples.
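
The ratio definitions above can be illustrated with a short Python sketch. It is not the analysis pipeline actually used; it assumes a linear-interpolation percentile (the text does not say which percentile method was applied) and reads R5 as (tumor 70th percentile minus 15th-percentile floor) divided by (normal 90th percentile minus the same floor). The function and argument names are hypothetical.

```python
from typing import Dict, Sequence


def percentile(values: Sequence[float], pct: float) -> float:
    """Return the pct-th percentile (0-100) using linear interpolation.

    The interpolation rule is an assumption; the source does not specify one.
    """
    if not values:
        raise ValueError("percentile of an empty sample is undefined")
    ordered = sorted(values)
    rank = (len(ordered) - 1) * pct / 100.0
    lower = int(rank)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (ordered[upper] - ordered[lower]) * (rank - lower)


def table_5a_ratios(
    adeno: Sequence[float],
    squamous: Sequence[float],
    non_tumor: Sequence[float],
) -> Dict[str, float]:
    """Compute R1-R5 for one probeset from its AI values per sample group.

    non_tumor holds AI values for normal and chronically diseased lung.
    """
    tumors = list(adeno) + list(squamous)
    everything = tumors + list(non_tumor)
    non_tumor_p90 = percentile(non_tumor, 90)
    floor = percentile(everything, 15)  # background level used only by R5
    return {
        "R1": percentile(tumors, 70) / non_tumor_p90,
        "R2": percentile(adeno, 80) / non_tumor_p90,
        "R3": percentile(squamous, 80) / non_tumor_p90,
        "R4": percentile(adeno, 80) / percentile(squamous, 80),
        "R5": (percentile(tumors, 70) - floor) / (non_tumor_p90 - floor),
    }


# Made-up AI values for a single probeset.
ratios = table_5a_ratios(
    adeno=[10, 20, 30, 40, 50],
    squamous=[20, 40, 60, 80, 100],
    non_tumor=[1, 2, 3, 4, 5],
)
```

Under this reading, R5 is undefined when the 90th percentile of the non-tumor samples equals the 15th-percentile floor, so a real analysis would need to guard against a zero denominator.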



Pkey 


ExAccn 


unigsneiu 


lUUUdO 






lUOUdb 






lOUUd/ 






1UUU/1 


Aztt tU<£ 




1UU114 


Al)2dUo 


Ue P0QG9 


luu I04 


noUi £\J 


nS.O loifc 


lUUloY 


U1 / /yo 


Ue 7Q1P'5 
MS./Olod 


lUUloo 


AWZ4/uyu 


nS.O/ l\J 1 


1UU2U2 


bh2y44U/ 


Ue QQQ1 n 

ns.yyy i u 


10U21b 


AA4byyUo 


ns.idyu 


iuu2by 


NM_uuiy4y 


ns.nby 


lUUZtif 


A 1 1H7RRK7 


ns.ioui/ 




Al lfl770EQ 


Ue AQOAOQ 

ns.ioZ4zy 


TUUddU 


AW41Uy/0 


Uc 771 K9 


TUUddb 


AW24/0£y 


Uc R7Q'i 

ns.b/yo 


lOOdbO 


\M7f\A 71 

W70171 


Uc 7CQO.G 

ns./oyoy 


100372 


MM (\AA70A 


Ue A QA11Q 

MS.lo4ddy 


100474 


NlvL00Qo99 


nS.dUUzbU 


1UU40D 


T19006 


Uc APiQAO 

nS.1Uc34Z 


100491 


D56165 


Uc 07C-fC3 

ns.2/b1bd 


100516 


U902/O 


Uc A A 

MS.n 


A ftftCOO 

100522 


X51501 


ns.yyy4y 


a ftficcn 

100559 


NM_uuuuy4 


Uc a a An 
t1S.1b4U 


1 00576 




Uc 17f\RR 

MS.df Ubo 


1 UUD2y 


A AMI CCQ*3 

AAUIbbyj 


Uc 01 OQ1 


AnnRRA 


bbbZoUUI 


Ue 1Q07/Q 

ns. loz/ ho 


lUUo/ f 


AAdodbob 


Ue K7fl1^ 


100596 


r\A AQQ~7 

U14boY 


Uc AOARQR 

MS.Iz joob 


luuvuy 


Nobody 


Uc AnnARQ 

nS.lUU4by 


inn7Ci 
lUU/Ol 


□t^uo4yi 


Ue OQR110 


lOUodO 


AuUU4f IV 


Uc y!7CC 

nS.4/bb 


Af\r\QR7 
lUUbbf 


U 14b 22 




luuyuz 


Ml cnoo 

iviibuzy 


Ue 007070 


luuyub 


Al I/Y7GQ1 R 

AUU/byib 


Ue C*3QQ 

ns.odyo 


100950 


JUU 124 


Ue 11770Q 

MS.1 1 / ( £d 


1U1U45 


JUob14 




101051 


NM_UUU1 fo 


Uc a oncao 


101071 , 


LU2o4U 


Ue QAOAA 

NS.04Z44 


101124 


1 AMAI 

L1Ud4d 


Ue HO'iA'i 

ns.i \£.jh \ 


101175 


1 IQ1G71 


Uc 0£OQft 

ns.dbyoU 


IU I 10 1 


XjiIcJjcXjc. I 


Uc 7T7QR 
no, / oi oo 


101204 L24203 Hs.82237

101210 L29301 Hs.2353

101216 AA284166 Hs.84113

101228 AA333387 Hs.82916

101233 AL135173 Hs.878

101273 Z11933 Hs.182505

101342 U52112 Hs.182018

101346 AI738616 Hs.77348

101369 NM_000892 Hs.1901

101396 BE267931 Hs.78996

101431 BE185289 Hs.1076

101448 NM_000424 Hs.195850

101462 AL035668 Hs.73853

101466 BE262660 Hs.170197

101484 AA053486 Hs.20315

101502 M26958

101505 AA307680 Hs.75692

101526 NM_002197 Hs.154721

101535 X57152 Hs.99853

101577 M34353 Hs.1041

101649 AW959908 Hs.1690

101663 NM_003528 Hs.2178

101664 AA436989 Hs.121017

101669 L24498 Hs.80409



Unigene Title 

AFFX control: GAPDH 
AFFX control: GAPDH 
AFFX control: GAPDH 
Human GABAa receptor alpha-3 subunit 
thymidylate synthetase 
KIAA0101 gene product 
aldo-keto reductase family 1, member C3
minichromosome maintenance deficient (S. 
phosphofructokinase, platelet 
proteasome (prosome, macropain) subunit, 
E2F transcription factor 3 
chaperonin containing TCP1, subunit 5 (e
protein disulfide isomerase-related prot 
minichromosome maintenance deficient (S. 
platelet-activating factor acetylhydrola 
uridine monophosphate kinase 
KIAA0175 gene product 
amylase, alpha 2A; pancreatic 
RAN, member RAS oncogene family 
non-metastatic cells 2, protein (NM23B) 
carcinoembryonic antigen-related cell ad 
prolactin-induced protein 
collagen, type VII, alpha 1 (epidermolys 
calcitonin/calcitonin-related polypeptid 
mitogen-activated protein kinase kinase 
Homo sapiens ribosomal protein L39 mRNA, 
zinc ribbon domain containing, 1 
general transcription factor IIA, 1 (37k
myeloid/lymphoid or mixed-lineage leukem
KIAA0618 gene product
flap structure-specific endonuclease 1 
gb:Human transketolase-like protein gene 
ret proto-oncogene (multiple endocrine n 
guanine monphosphate synthetase 
keratin 14 (epidermolysis bullosa simple 
gb:Human proliferating cell nuclear anti 
glucose phosphate isomerase 
potassium voltage-gated channel, Shab-re 
protease inhibitor 3, skin-derived (SKAL 
melanoma antigen, family A, 2 
macrophage migration inhibitory factor ( 
ataxia-telangiectasia group D-associated 
opioid receptor, mu 1 
cyclin-dependent kinase inhibitor 3 (CDK 
chaperonin containing TCP1, subunit 6A ( 
sorbitol dehydrogenase 
POU domain, class 3, transcription facto 
interleukin-1 receptor-associated kinase 
hydroxyprostaglandin dehydrogenase 15-(N 
kallikrein B, plasma (Fletcher factor) 1 
proliferating cell nuclear antigen 
small proline-rich protein 1B (cornifin) 
keratin 5 (epidermolysis bullosa simplex 
bone morphogenetic protein 2 
glutamic-oxaloacetic transaminase 2, mit 
interferon-induced protein with tetratri 
gb:Human parathyroid hormone-related pro 
asparagine synthetase 
aconitase 1, soluble 
fibrillarin 

v-ros avian UR2 sarcoma virus oncogene h 
heparin-binding growth factor binding pr 
H2B histone family, member Q 
H2A histone family, member A 
growth arrest and DNA-damage-inducible, 



R1 R2 R3 



R4 



8.00 



3.84 
3.33 



2.55 



5.07 



3.10 
3.85 



2.57 

3.12 
3.50 

4.08 

2.53 

8.50 

3.24 
8.31 

10.50 
4.02 



54.00 

5.59 

7.00 



7.20 



8.60 



7.60 

10.20 
8.00 



12.91 



24.80 



15.65 



14.20 



9.30 
20.60 



10.00 



6.40 



21.89 
12.80 



38.80 
12.00 



R5 

6.76 
5.77 
5.75 

5.71 



4.52 
5.49 
5.67 

5.66 
3.81 
4.50 

4.82 
3.79 

5.49 
4.17 



5.16 

4.69 
4.19 



5.69 



7.90 
4.45 

4.17 



7.90 

4.01 

4.46 
4.65 



7.60 
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101695 M69136 Hs.135626 

101724 L11690 Hs.620

101748 NM_001944 Hs.1925

101759 M80244 Hs.184601

101771 NM_002432 Hs.153837

101804 M86699 Hs.169840

101809 M86849 Hs.323733

101833 AU076442 Hs.117938

101842 M93221 Hs.75182

101851 BE260964 Hs.82045

102002 NM_002484 Hs.81469

102039 AL134223 Hs.306098 

102072 U09410 Hs.78743 

102083 T35901 Hs.75117 

102111 L36196 Hs.81884

102123 NM_001809 Hs.1594 

102154 U17760 Hs.75517 

102193 AL036335 Hs.313 

102217 AA829978 Hs.301613 

102224 NM_002810 Hs.148495

102234 AW163390 Hs.278554 

102251 NM_004398 Hs.41706 

102305 AL043202 Hs.90073 

102330 BE298063 Hs.77254 

102340 U37055 Hs.278657

102348 U37519 Hs.87539

102368 U39817 Hs.36820

102394 NM_003816 Hs.2442

102404 NM_005429 Hs.79141

102537 U57094 Hs.50477

102581 AU077228 Hs.77256 

102605 AI435128 Hs.181369 

102610 U65011 Hs.30743 

102623 AW249285 Hs.37110 

102642 AA205847 Hs.23016

102654 AV649989 Hs.24385 

102659 BE245169 Hs.211610 

102669 U71207 Hs.29279 

102672 U72066 Hs.29287 

102687 NM_007019 Hs.93002

102696 BE540274 Hs.239 

102768 U82321 

102781 BE258778 Hs.108809 

102784 U85658 Hs.61796 

102824 U90916 Hs.82845

102829 NM_006183 Hs.80962

102888 AI346201 Hs.76118 

102892 BE440042 Hs.83326 

102913 NM_002275 Hs.80342 

102935 BE561850 Hs.80506

102951 X15218 Hs.2969 

102983 BE387202 Hs.118638 

103023 AW500470 Hs.117950

103036 M13509 Hs.83169

103038 AA926960 Hs.334883

103060 NM_005940 Hs.155324

103099 AI693251 Hs.8248 

103119 X63629 Hs.2877 

103168 X53463 Hs.2704 

103185 NM_006825 Hs.74368

103192 M22440 Hs.170009 

103223 BE275607 Hs.1708 

103242 X76342 Hs.389 

103316 X83301 Hs.324728 
TABLE 5B shows the accession numbers for those primekeys lacking UnigeneID's for Table 5A. For each probeset we have listed the gene cluster number from which the
oligonucleotides were designed. Gene clusters were compiled using sequences derived from Genbank ESTs and mRNAs. These sequences were clustered based on sequence
similarity using Clustering and Alignment Tools (DoubleTwist, Oakland, California). The Genbank accession numbers for sequences comprising each cluster are listed in the
"Accession" column.

Pkey: Unique Eos probeset identifier number 
CAT number: Gene cluster number 
Accession: Genbank accession numbers 



Pkey 

117079 
124305 
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109792 
126034 
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126345 
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AI566663 AW51 2676 A1570580 AI023690 AA44821 6 AI079853 AI422707 AA77951 6 AW026972 AW1 30082 AW1 62307 AW438646 AA709332 
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AA21 9425 AA629658 AI81 1 71 9 AW41 1 275 AI590981 W37907 AI591 178 AI684051 AA983238 AA669347 AA976239 M704570 AI628339 
AI884391 AI241580 A1003539 AW1 76687 AA009650 N34566 AI333493 Al 186070 M070827 AA41 1683 AI280884 AA872023 AA207255 
AA021576 N71953 A1885888 AW076039 T15777 AI537673 AW248048 H09554 W93480 W47001 AW0791 14 AA063160 AA757453 R60788 
AI859431 H20478 AA218882 M757465 AA100995 AI864135 AI934209 AA070503 H47008 AA219646 W61039 W93907 AW385050 W37967 
W78028 M189007 M4791 36 R93650 M44231 2 T30287 AA847628 AA1 80262 AA009649 C03892 AW149464 M31 0963 AA21 9693 
AA069747 R29207 AA094784 AA29361 5 AA447848 AI9841 67 N90393 C05097 N56499 AW292351 AW1 49681 AW473258 AA629322 A1004409 
AW105577AI954937AI811070 M902422AW514437M535460M916877AW517122M974657M975649AW517130AW517129 F31737 
W07688 M193645 AA378994 M489273 F32267 W39303 AA021 181 N86810 M406524 AA062553 AA436801 H08985 H1 5979 N4031 0 
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AA436789 AA232172 AW360778 W25862 R60282 AA436530 AA378894 AA1 87461 A1940535 AA604210 AA0895U AA360421 N88243 N84281 
AA209340 N561 74 N88374 AA191088 AW247691 AA24901 3 M0931 1 1 M972536 AW298594 AA375893 T1 21 39 W281 86 AW243849 
AI288629 AA843996 W15260 AH88286 AW248079 R15836 

119599 genbank_W45552 W45552

112382 genbank_R59904 R59904 

105264 genbank_AA227934 AA227934 

100071 entrez_A28102 A28102 

123315 714071_1 AA496369 AA496646



Table 6A shows 99 genes up-regulated in non-smokers with lung cancer relative to smokers with lung cancer. These genes were selected from 59680 probesets on the
Eos/Affymetrix Hu03 Genechip array. Gene expression data for each probeset obtained from this analysis was expressed as average intensity (AI), a normalized value reflecting
the relative level of mRNA expression.

Pkey: Unique Eos probeset identifier number 

ExAccn: Exemplar Accession number, Genbank accession number 

UnigenelD: Unigene number 

Unigene Title: Unigene gene title 

R1: average of AI for samples from non-smokers with adenocarcinoma divided by the 90th percentile of AI for samples from smokers with adenocarcinoma

R2: average of AI for samples from non-smokers with squamous cell carcinoma divided by the 90th percentile of AI for samples from smokers with squamous cell
carcinoma
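
As a rough illustration of the mean-over-percentile ratio defined above, the following Python sketch uses the standard library `statistics` module. The "inclusive" decile method is an assumption, since the text does not state how the 90th percentile was computed, and the function names and sample values are hypothetical.

```python
from statistics import mean, quantiles
from typing import Sequence


def p90(values: Sequence[float]) -> float:
    """90th percentile, taken as the last of the nine decile cut points."""
    return quantiles(values, n=10, method="inclusive")[-1]


def nonsmoker_vs_smoker_ratio(
    nonsmoker_ai: Sequence[float], smoker_ai: Sequence[float]
) -> float:
    """Average AI in non-smokers divided by the 90th percentile AI in smokers.

    Both inputs should come from the same tumor type (adenocarcinoma for R1,
    squamous cell carcinoma for R2).
    """
    return mean(nonsmoker_ai) / p90(smoker_ai)


# Made-up AI values for a single probeset.
smokers = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]
nonsmokers = [20, 30, 40]
ratio = nonsmoker_vs_smoker_ratio(nonsmokers, smokers)
```

Comparing a group mean against the other group's 90th percentile is a conservative choice: a probeset scores highly only when the typical non-smoker sample exceeds nearly all smoker samples.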
TABLE 6B shows the accession numbers for those primekeys lacking UnigeneID's for Table 6A. For each probeset we have listed the gene cluster number from which the
oligonucleotides were designed. Gene clusters were compiled using sequences derived from Genbank ESTs and mRNAs. These sequences were clustered based on sequence
similarity using Clustering and Alignment Tools (DoubleTwist, Oakland, California). The Genbank accession numbers for sequences comprising each cluster are listed in the
"Accession" column.



Pkey: Unique Eos probeset identifier number 
CAT number: Gene cluster number 
Accession: Genbank accession numbers 



Pkey CAT number Accessions 

108562 36375_1 AA100796 AF020589 AA074629 AA075946 AA100849 AA085347 AA126309 AA079311 AA079323 AA085274

103439 35330_1 X98266 N41124

123551 genbank_AA608837 AA608837

123861 genbank_AA620840 AA620840

102832 entrez_U92015 U92015

101972 entrez_S82472 S82472 

121558 genbank_AA412497 AA412497 
Table 7A shows 98 genes down-regulated in non-smokers with lung cancer relative to smokers with lung cancer. These genes were selected from 59680 probesets on the
Eos/Affymetrix Hu03 Genechip array. Gene expression data for each probeset obtained from this analysis was expressed as average intensity (AI), a normalized value reflecting
the relative level of mRNA expression.

Pkey: Unique Eos probeset identifier number 

ExAccn: Exemplar Accession number, Genbank accession number 

UnigenelD: Unigene number 

Unigene Title: Unigene gene title

R1: 90th percentile of AI for samples from smokers with adenocarcinoma divided by the average of AI for samples from non-smokers with adenocarcinoma.

R2: 90th percentile of AI for samples from smokers with squamous cell carcinoma divided by the average of AI for samples from non-smokers with squamous cell
carcinoma.
TABLE 7B shows the accession numbers for those primekeys lacking UnigeneID's for Table 7A. For each probeset we have listed the gene cluster number from which the
oligonucleotides were designed. Gene clusters were compiled using sequences derived from Genbank ESTs and mRNAs. These sequences were clustered based on sequence
similarity using Clustering and Alignment Tools (DoubleTwist, Oakland, California). The Genbank accession numbers for sequences comprising each cluster are listed in the
"Accession" column.

Pkey: Unique Eos probeset identifier number 
CAT number: Gene cluster number 
Accession: Genbank accession numbers 

Pkey CAT number Accessions 

103207 30635_-4 X72790

106566 120358_1 BE298210 AI672315 AW086489 BE298417 AA455921 AA902537 BE327124 R14963 AA085210 AW274273 AI333584 AI369742 AI039658 AI885095 AI476470 AI287650 AI885299 AI985381 AW592624 AW340136 AI266556 AA456390 AI310815 AA484951

116571 genbank_D45652 D45652

118466 genbank_N66741 N66741

101046 entrez_K01160 K01160

101941 entrez_S77583 S77583

103351 entrez_X89211 X89211

123130 genbank_AA487200 AA487200 
Table 8A shows 1720 genes either up- or down-regulated in lung tumors or chronically diseased lung relative to a broad collection of over 40 distinct normal body tissues.
Chronically diseased lung samples represent chronic non-malignant lung diseases such as fibrosis, emphysema, and bronchitis. These genes were selected from 39494
probesets on the Eos/Affymetrix Hu02 Genechip array. Gene expression data for each probeset obtained from this analysis was expressed as average intensity (AI), a
normalized value reflecting the relative level of mRNA expression.

Pkey: Unique Eos probeset identifier number 

ExAccn: Exemplar Accession number, Genbank accession number 

UnigenelD: Unigene number 

Unigene Title: Unigene gene title 

R1: 70th percentile of AI for lung tumors divided by 90th percentile of AI for normal lung

R2: 70th percentile of AI for chronically diseased lung divided by 90th percentile of AI for normal lung
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dynein, cytoplasmic, light intermediate 3.27 7.24 

mucin 4, tracheobronchial 2.54 1.88 

synaptonemal complex protein 2 1.00 0.91 

CD3-epsilon-associated protein; antisens 2.63 2.67 

adaptor-related protein complex 4, epsil 5.82 9.34 

KIAA1054 protein 3.66 3.18 

Homo sapiens mRNA; cDNA DKFZp564J062 (fr 2.44 6.77 

sine oculis homeobox (Drosophila) homolo 0.44 0.84 

UDP-N-acetylglucosamine:a-1,3-D-mannosid 4.18 5.64 

nuclear cap binding protein subunit 2, 2 1.85 0.92

SWI/SNF related, matrix associated, acti 2.04 2.13

U6 snRNA-associated Sm-like protein LSm8 1.44 1.89

Homo sapiens cDNA FLJ13540 fis, clone PL 0.51 1.10

ESTs 2.64 4.87

gap junction protein, beta 6 (connexin 3 5.34 2.68

hypothetical protein FLJ22965 1.00 1.21

SMS3 protein 0.52 1.24

odd Oz/ten-m homolog 2 (Drosophila, mous 1.00 1.00

MCT-1 protein 1.58 1.02 

NADH dehydrogenase (ubiquinone) 1 beta s 2.72 6.85 

ESTs 1.00 4.32 

Homo sapiens, clone IMAGE:2823731, mRNA, 2.97 0.93

S164 protein 0.80 0.95

gb:yu66g11.r1 Weizmann Olfactory Epithel 1.68 5.04

ESTs 2.70 7.98

gb:Homo sapiens mRNA for immunoglobulin 4.25 8.13

gb:Human immunoglobulin heavy chain, V-r 3.91 8.68 

gb:Human autonomously replicating sequen 2.20 2.73 

hypothetical protein FLJ20920 0.54 1.02 

gb:Homo sapiens (clone WR4.1 OVH) anti-th 4.28 11.57

KIAA1555 protein 1.57 2.38 

ESTs 2.94 4.68 

gb:Homo sapiens mRNA for immunoglobulin 3.49 6.31 

hypothetical protein FLJ10494 0.80 2.74 

gb:H.sapiens mRNA for variable region of 1.13 0.77 

ESTs, Moderately similar to putative DNA 3.14 10.68 

hypothetical protein FLJ20051 3.04 8.24 

gb:H.sapiens rearranged Ig heavy chain ( 1.80 1.92

hypothetical protein LOC57822 1.00 1.00

ESTs, Weakly similar to T17330 hypotheti 0.53 0.67

hypothetical protein FLJ12894 2.45 2.62

Homo sapiens cDNA: FLJ23137 fis, clone L 4.88 8.61

gb:Homo sapiens clone 2A1 scFV anitbody 1.41 1.86

RAB22A, member RAS oncogene family 1.51 1.19

peptidylprolyl isomerase (cyclophilin)-l 0.72 0.76

gb:H.sapiens T-cell receptor mRNA 1.17 3.90

kinesin family member 13A 4.08 6.46

zinc finger protein 180 (HHZ168) 2.50 4.37 

Pur-gamma 5.38 8.38 

NM23-H8 3.26 4.08 

DC2 protein 2.02 1.83 

myosin, light polypeptide, regulatory, n 1.32 3.95 

ESTs 0.77 0.53 

ESTs 0.24 0.63 

hypothetical protein FLJ10534 3.56 6.22 

ESTs 2.28 3.17 

protocadherin12 0.38 1.02 

ESTs 2.30 1.00 

Homo sapiens clone 24468 mRNA sequence 1.86 4.48 

p53 regulated PA26 nuclear protein 0.10 0.80 

ESTs 4.54 9.65 

ESTs, Weakly similar to Homolog of rat Z 0.09 0.04

ESTs, Weakly similar to unnamed protein 1.00 1.72

gb:EST96097 Testis I Homo sapiens cDNA 5 4.96 9.14

phosphatidic acid phosphatase type 2C 2.06 2.02

ATPase, (Na+)/K+ transporting, beta 4 po 1.00 1.24

ESTs 1.08 1.43 

glucose phosphate isomerase 1.76 1.31 

karyopherin (importin) beta 3 2.30 2.57 

polymerase (RNA) II (DNA directed) polyp 3.10 5.79 

Homo sapiens cDNA FLJ12363 fis, clone MA 5.06 11.86

gb:xo43c12.x1 NCI CGAPJJU Homo sapiens 5.14 7.31

ESTs, Weakly similar to ALU1_HUMAN ALU S 2.83 4.06

gb:xu71a11.x1 NCI_CGAP_Kid8 Homo sapiens 1.15 2.35

gb:xt68f05.x1 NCI_CGAP_Ut2 Homo sapiens 2.20 9.35

gb:xp70b11.x1 NCI_CGAP_Ov39 Homo sapiens 4.85 6.28

gb:xt66h02.x1 NCI_CGAP_Ut2 Homo sapiens 3.21 4.07 
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ribosomal protein S27a 6.50 

eukaryotic translation elongation factor 1.88

gb:FB21B7 Fetal brain, Stratagene Homo s 2.15 

gb:FB26F2 Fetal brain, Stratagene Homo s 5.88 

gb:FB7C1 Fetal brain, Stratagene Homo sa 5.59 

ribosomal protein S14 6.55 

gb:yb42d06.s1 Stratagene fetal spleen (9 6.18

gb:yb73g01.s1 Stratagene ovary (937217) 2.64 

gb:yc04c12.s1 Stratagene lung (937210) H 0.53 

ribosomal protein, large, P1 6.49

gb:yi87g02.s1 Soares placenta Nb2HP Homo 2.90 

gb:ym31a06.s1 Soares infant brain 1NIB H 1.00 

gb:yr78b06.s1 Soares fetal liver spleen 0.79 

gb:yy82d08.s1 Soares_multiple_sclerosis_ 4.28

gb:zd88h06.s1 Soares_fetal_heart_NbHH19W 6.47

ribosomal protein, large, P0 1.34

vimentin 3.40

proteasome (prosome, macropain) 26S sub 2.93

gb:zp38g12.s1 Stratagene muscle 937209 H 3.98

glyceraldehyde-3-phosphate dehydrogenase 3.32

gb:EST54044 Fetal heart II Homo sapiens 1.00

gb:zv26g05.s1 Soares_NhHMPu_S1 Homosapi 1.42

gb:zx82c11.s1 Soares ovary tumor NbHOT H 2.18

gb:zx02c05.s1 Soares_total_fetus_Nb2HF8_ 5.38

glyceraldehyde-3-phosphate dehydrogenase 4.16

serine (or cysteine) proteinase inhibito 0.55

gb:nh85e08.s1 NCI_CGAP_Br1.1 Homo sapien 1.95

ferritin, light polypeptide 2.10 

ribosomal protein S23 3.33 

gb:nm75h11.s1 NCI_CGAP_Co9 Homo sapiens 1.33 

gb:nn13g09.s1 NCI_CGAP_Co12 Homo sapiens 3.68 

KIAA1685 protein 2.77 

PRO2047 protein 7.16 

vimentin 2.47 

ESTs 6.78 

immunoglobulin heavy constant gamma 3 (G 0.90 

gb:zu89h06.s1 Soares_testis_NHT Homo sap 6.46 

gb:ab99c04.s1 Stratagene lung (937210) H 1.00 

gb:nr72a12.s1 NCI_CGAP_Pr24 Homo sapiens 5.68

ESTs 1.48

gb:nt01g08.s1 NCI_CGAP_Lym3 Homo sapiens 1.76

EST, Weakly similar to EF1D_HUMAN ELONG 1.00

gb:ag57d12.s1 Gessler Wilms tumor Homo s 5.31

glyceraldehyde-3-phosphate dehydrogenase 0.78

gb:ag37e01.s1 Jia bone marrow stroma Hom 3.11

nuclear factor of kappa light polypeptid 4.38 

gb:zj44f07.s1 Soares Jetal liver_spleen_ 2.13 

EST 1.20 

immunoglobulin heavy constant gamma 3 (G 1.16 

gb:ai10f08.s1 Soares_parathyroid_tumor_N 5.86 

gb:nx10c08.s1 NCI_CGAP_GC3 Homo sapiens 2.21

hypothetical protein FLJ11726 3.36

EST 1.00 

gb:nz12e05.s1 NCI_CGAP_GCB1 Homo sapiens 6.44 

hemoglobin, alpha 2 0.19 

gb:aj09h02.s1 Soares_parathyroid_tumor_N 1.00

ribosomal protein S18 7.57

gb:oe29a12.s1 NCI_CGAP_Pr25 Homo sapiens 4.78

gb:oe29c12.s1 NCI_CGAP_Pr25 Homo sapiens 0.89

gb:nw31e04.s1 NCI_CGAP_GCB0 Homo sapiens 4.49

gb:ai67a05.s1 Soares_testis_NHT Homo sap 4.91 

ribosomal protein, large, P0 0.19 

gb:of34a02.s1 NCI_CGAP_Kid6 Homo sapiens 5.12 

gb:ak72b06.s1 Barstead spleen HPLRB2 Hom 1.66

gb:ak84a08.s1 Barstead spleen HPLRB2 Hom 2.34

ribosomal protein, large, P0 0.30

gb:oh63h08.s1 NCI_CGAP_Kid5 Homo sapiens 2.10

gb:nx21h02.s1 NCI_CGAP_GC3 Homo sapiens 0.32

gb:am08b07.s1 Soares_NFL_T_GBC_S1 Homo s 1.56

ribosomal protein S6 kinase, 90kD, polyp 5.21

EST 1.96

gb:ok03g03.s1 Soares_NFL_T_GBC_S1 Homo s 7.38

gb:ok78g02.s1 NCI_CGAP_GC4 Homo sapiens 7.19

gb:ok85h11.s1 NCI_CGAP_Kid3 Homo sapiens 6.50



gb:og21a07.s1 NCI_CGAP_PNS1 Homo sapiens 4.21 



tRNA isopentenylpyrophosphate transferas 2.20 

gb:oo60g04.s1 NCI_CGAP_Lu5 Homo sapiens 2.84

gb:oi53h05.s1 NCI_CGAP_HN3 Homo sapiens 1.60

interleukin 21 receptor 1.65

ribosomal protein S18 3.78

EST, Moderately similar to JC4662 ribos 4.30

gb:op09d05.s1 NCI_CGAP_Kid6 Homo sapiens 0.95

hypothetical protein FLJ20284 3.19

gb:oq35e09.s1 NCI_CGAP_GC4 Homo sapiens 4.67

gb:oq72e12.s1 NCI_CGAP_Kid6 Homo sapiens 3.92
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gb:op33c06.s1 Soares_NFLJJ3BC_S1 Homo s 
ribosomal protein L18a 

gb:or84d07.s1 NCI_CGAP_Lu5 Homo sapiens
EST, Weakly similar to RL23_HUMAN 60S R
gb:ou57e08.s1 NCI_CGAP_Br2 Homo sapiens
gb:os25c12.s1 NCI_CGAP_Kid5 Homo sapiens
gb:os18c10.s1 NCI_CGAP_Kid5 Homo sapiens
glyceraldehyde-3-phosphate dehydrogenase
ribosomal protein, large P2
gb:ou11b07.x1 Soares_NFL_T_GBC_S1 Homo s
PRO2047 protein

gb:ov29f10.x1 Soares_testis_NHT Homo sap
EST 

hemoglobin, alpha 2 

gb:ow70h12.s1 Soares_fetal_liver_spleen_ 
ESTs 

gb:qa75h12.x1 Soares_fetal_heart_NbHH19W 
gb:qa33c06.s1 Soares_NhHMPu_S1 Homo sapi 
gb:am66f03.s1 Barstead spleen HPLRB2 Hom 
gb:am55e09.x1 Johnston frontal cortex Ho 
ribosomal protein L13a 

gb:qb85b12.x1 Soares_fetal_heart_NbHH19W 
gb:ox70h06.s1 Soares_NhHMPu_S1 Homo sapi 
gb:qc99g06.x1 Soares_pregnant_uterus_NbH 
ferritin, light polypeptide 
EST 

CD68 antigen 
ESTs 

ribosomal protein S3A 

gb:qh92b02.x1 Soares_NFL_T_GBC_S1 Homo s 
collagen, type I, alpha 2 

gb:qh30g11.x1 Soares_NFL_T_GBC_S1 Homo s 
gb:q!72d03.x1 Soares_NhHMPu_S1 Homo sapi 
gb:qu52f11.x1 NCI_CGAP_Lym6 Homo sapiens 
gb:qp65a12.x1 Soares_fetal_lung_NbHL19W 
gb:qm01f02.x1 Soares_NhHMPu_S1 Homo sapi 
ribosomal protein S19 

gb:tb17b01.x1 NCI_CGAPJ3v37 Homo sapiens 
EST, Weakly similar to RL6_HUMAN 60S RI 
small nuclear ribonucleoprotein polypept 
gb:qt43b07.x1 Soares_fetal_lung_NbHL19W 
gb:qt27f07.x1 Soares_pregnant_uterus_NbH 
gb:qo26a07.x1 NCI_CGAP_Lu5 Homo sapiens 
gb:tc05d02.x1 NCI_CGAP_Co16 Homo sapiens 
gb:qt18f09.x1 NCI_CGAP_GC4 Homo sapiens 
gb:qt09d02.x1 NCI_CGAP_GC4 Homo sapiens 
gb:qt09g03.x1 NCI_CGAP_GC4 Homo sapiens 
gb:qt94a11.x1 NCI_CGAP_Co14 Homo sapiens 
EST, Weakly similar to R5HU22 ribosomal 
gb:qz08g05.x1 NCI_CGAP_CLL1 Homo sapiens 
gb:tg02h05.x1 NCI_CGAP_CLL1 Homo sapiens 
eukaryotic translation elongation factor 
ESTs 

gb:ti60a08.x1 NCI_CGAP_Lym12 Homo sapien 
hemoglobin, alpha 1 

glyceraldehyde-3-phosphate dehydrogenase 
EST, Weakly similar to RL10_HUMAN 60S R 
eukaryotic translation elongation factor 
eukaryotic translation elongation factor 
gb:tj77e12.x1 Soares_NSF_F8_9W_OT_PA_P_S 2.38 
EST 

gb:tn93d08.x1 NCI_CGAP_Ut2 Homo sapiens 
ESTs, Weakly similar to schlafen4 [M.mu 
anaplastic lymphoma kinase (Ki-1) 
gb:PT2.1_12_E04.r tumor2 Homo sapiens cD 
gb:PT2.1_13_H06.r tumor2 Homo sapiens cD 
gb:PT2.1_15_D07.r tumor2 Homo sapiens cD 
ribosomal protein S3 
F10802 Hs.163628 ESTs, Moderately similar to ALU7_HUMAN 
TABLE 8B shows the accession numbers for those Pkeys in Table 8A lacking Unigene IDs. For each probeset we have listed the gene cluster number from which the 
oligonucleotides were designed. Gene clusters were compiled using sequences derived from Genbank ESTs and mRNAs. These sequences were clustered based on sequence 
similarity using Clustering and Alignment Tools (DoubleTwist, Oakland, California). The Genbank accession numbers for sequences comprising each cluster are listed in the 
"Accession" column. 



Pkey: Unique Eos probeset identifier number 
CAT number: Gene cluster number 
Accession: Genbank accession numbers 
TABLE 8C shows the genomic position for those Pkeys in Table 8A lacking Unigene IDs and accession numbers. For each predicted exon, we have listed the genomic 
sequence source used for prediction. Nucleotide locations of each predicted exon are also listed. 

Pkey: Unique number corresponding to an Eos probeset 

Ref: Sequence source. The 7 digit numbers in this column are Genbank Identifier (GI) numbers. "Dunham I. et al." refers to the publication entitled "The DNA 
sequence of human chromosome 22." Dunham I. et al., Nature (1999) 402:489-495. 
Strand: Indicates DNA strand from which exons were predicted. 

Nt_position: Indicates nucleotide positions of predicted exons. 
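
The columns defined above map naturally onto a small record type. The following is a minimal sketch in Python and is not part of the original disclosure: the names `PredictedExon` and `parse_exon` are hypothetical, and the length calculation assumes that both coordinates are inclusive, which the text does not state. The sample values are taken from the Table 8C entry for Pkey 335188.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PredictedExon:
    """One Table 8C entry: probeset key, sequence source, strand, coordinates."""
    pkey: str
    ref: str
    strand: str
    start: int
    end: int

    @property
    def length(self) -> int:
        # Assumption: coordinates are inclusive. Some Minus-strand rows list
        # the larger coordinate first, so the absolute difference is used.
        return abs(self.end - self.start) + 1


def parse_exon(pkey: str, ref: str, strand: str, nt_position: str) -> PredictedExon:
    """Build a record from the four column values of one table row."""
    if strand not in ("Plus", "Minus"):
        raise ValueError(f"unexpected strand value: {strand!r}")
    start_text, end_text = nt_position.split("-")
    return PredictedExon(pkey, ref, strand, int(start_text), int(end_text))


exon = parse_exon("335188", "Dunham, I. et al.", "Plus", "21669118-21669328")
print(exon.pkey, exon.strand, exon.length)  # 335188 Plus 211
```

Keeping the strand as a validated string rather than a sign makes it easy to spot rows where the value was damaged in transcription.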



Pkey    Ref                Strand Nt_position 




0O£,IS£. 


Dunham, 


at al 


Pine 


7qqQH 7q7RQ 
(000 \ -l 0/00 


qq9R1fi 

OOZO 10 


Dunham, 


pt ai 


Pine 
r IUS 


ooyot»t-oouuou 


qq9qn.fi 

OOZyUO 


UUIIlldlll, 


Pt al 


Pine 
nub 


iy^o iu i - ly&Oiiuo 


OOCV 1 1 


Dunham, 


at al 

i. ei.ai. 


Pine 
rlUS 


1QR17fi7 1QR1ft^R 
i so 1 / 0 / - 1 yo 1 000 


oozy iz 


Dunham, 


o\ al 

. ei.ai. 


Pine 

rius 


i y Oil i zu- 1 y oz^*to 




uuiniam, 


ot al 

. ci.ai. 


Pine 
rius 


o(\{\qro{\ onnQ7^R 
zuuyozu-^uuy/ 00 


qqOQRfi 

oozyoo 


Dunham, 


ot 0) 

1. ei.ai. 


Pine 
rlUS 


CO lUOZO-ZO IUuOO 


oozyoy 


uunnam, 


at al 

. ei.ai, 


Pltic 
rlUS 


£.0x0 I *t0-^0 \Oc 10 


qqqi qo 
000 100 


uunnam, 


ot al 

. 6i. ai. 


Pine 
rlUS 


oooyiiuo-oooyozo 


^qqi 

oooioy 


Dunham, 


ot at 

. ei.ai. 


Pine 
rlUS 


oooy*tyo-oooyo/ 1 


9*30991 

OOOZZl 


Dunham, 


at ol 

. ei.ai. 


Dine 

rlUS 


q07R07n qQ7R1R7 

oy/oU/u-oy/oiof 


oqqqfln 
OOOOoU 


Dunham, 


at ol 


Dltle 

rlUS 


4oU*t/ /0-4yU*t040 


9qq9R7 
00000 1 


Dunham, 


at at 


Pine 
rlUS 


*ty iuyoo-*ty iuyy/ 


0000 I £ 


Dunham, 


at fit 
i 61,9). 


Plus 


00DU0 lU-ODOUOD't 




Dunham, 


at al 
. 61.31. 


Pine 
rlUS 


^R*i9fi9n *;fi*i97fln 


OqqRRR 

OoOooO 


Dunham, 


at ol 

. ei.ai. 


Pine 
• rlUS 


R9^A77R_ft9 ^/IRQ A 

ozo*t/ /o-ozo t toy4 


qqqfilfl 


DunhsfT), 


ot al 
. 81.31, 


Pine 
rlUS 


RRR9^Q1 RRR9SRR 


oqqR97 


Dunham, 


at ol 


Dine 
rlUS 


RR9riRR/l RR9AQnq 

oozuooi-oozuyuo 


OoooZo 


Dunham, 


at ol 

. et.ai. 


Dine 

rlUS 


oozyuu4-oozyzoo 


JjJOOU 


Dunham, 


at 0) 

. et.ai, 


rlUS 


0 / yoaoz-o ( y / 1 4.0 


q09R7fl 
0000 /o 


Dunham, 


at ol 

. et.ai. 


Dine 

rlUS 


/UOO/ZO-/UOOZOO 


qqq7sn 


Dunham, 


at ol 

. et.ai, 


Dine 

rlUS 


/OUOlOO-f OUOZ04 


°997C9 
OOO/OO 


Dunham, 


at ol 

. ei.ai. 


Dim* 

rlUS 


/oyz4y i - / oy/oou 


q097R7 
000/0/ 


Dunham, 


at ot 

. et.ai. 


Dine 

rlUS 


7RQylyin7 7RQ>lRO'3 

/oy44U/-/oy4ozo 


qoq7RR 

OOO/OO 


Dunham, 


at ol 

. et.ai. 


Dine 

rlUS 


"JRQRAAft 7RQRRQ7 

/oyo44u-/oyooy/ 


ooo /oy 


Dunham, 


. et.ai. 


rlUS 


/oyoo/o-foya/u/ 


q99779 
OOOf f Z 


Dunham, 


at ol 
. BUS). 


Dine 

rlUS 


770R77Q 77nRQf\0 

/ tvof fo-f /uoyuz 


099777 
000/ / / 


Dunham, 


at ol 

. et.ai. 


Dine 

rlUS 


77/RRnc 77/RQ4C 

i /4ooUo- / / 4oy 1 b 


ooJo4o 


Dunham, 


at ol 

. et.ai. 


Dine 

rlUS 


oUUoozo-oUUo /of 


999RR/I 

0oooo4 


Dunham, 


at ol 

. et.ai. 


Dine 
rlUS 


HI RqQRn R-l HA4 C<| 

0 1 ooy OU-0 1 Oh! 0 1 


q99RR7 

OOOoo/ 


Dunham, 


at ol 

. et.ai. 


Dine 

rlUS 


0 1 0400i-0 1 OOUZO 


q99fla*i 

ooooyi 


Dunham, 


at ot 

. et.ai. 


Dtne 

rlUS 


0 10040/-0 ioo/uy 


OoqflQO 

ooooyz 


Dunham, 


. et.ai. 


Pine 

rlUS 


R1 ERROR Qi C7nni 
OlOOoZO-OIO/UU I 


0ooy4o 


Dunham, 


of ol 

. et.ai. 


Dine 

rlUS 


RRRqiiQ7 RRflqR07 
0OoJ4y 1 -D0D00Z/ 


qoqQR/l 
OOOaOH 


Dunham, 


at ot 

. et.ai. 


Pine 
rlUS 


RER^IRR RRR^^^R 
0000 100-0000000 


qooORR 

oooyoo 


Dunham, 


at ol 

. et.ai. 


Dine 
rlUS 


RRRRR/1'3 RRRRROR 
0000040-OOOOOiiO 


OqqQRfl 

oooyoo 


Dunham, 


at ol 

. et.ai. 


Pine 
rlUS 


RRRIDH/l RRR10/11 
000 I UU*t-000 I L°t I 


qq/MR1 
00*tUO i 


Dunham, 


at at 

. ei.ai. 


Pine 
rlUS 


QRRRQ4.1 QRR7n77 
yoooy** i-yoo/u/ / 


oo*tuy*t 


Dunham, 


at al 

. et.ai, 


Pine 
rlUS 


QRRQQRq QRQHinR 

yooyyoo-yoyu iuo 


oovm -to 

004110 


Dunham, 


at ol 

. et.ai. 


Dine 

rlUS 


iuzo<:4oy- 1 u/ozoy / 


004 10 1 


Dunham, 


at at 

. et.ai. 


Pine 

rlUS 


1 uoy yuo j- 1 uoy y i ou 


09/I01Q 

oo4Ziy 


Dunham, 


at 0) 

. et.ai. 


Pine 

rlUS 


1 CI 1 0 1 OU-1 Z / 1 0do4 


qq/oqo 

oo^zoy 


Dunham, 


at ol 

. et.ai. 


Pine 

rius 


1 ouooooy- 1 ouoooyo 


oq/qqo 

OOhOOO 


Dunham, 


at al 

. et.ai. 


Pine 
rlUS 


1 001/00*14- 1 oouooo / 


99jiq7fl 
00*tO f 0 


Dunham, 


ot ot 

. et.ai. 


Pine 

rius 


I oy u / zoy- 1 oy u/ 0 / u 


qq/iqQO 
OOtOOZ 


uunnarn, i 


at al 


Pine 
"JUS 


1 RRRR-1 fimR 

1 oy j oooo- j oy \ duod 


00400Z 


Dunham, 


at ol 

. et.ai. 


Dine 

rlUS 


1/tQR7fl/l7 1AGR7Qiin 

i4yo/04/- i4yo/y4u 


OOylCRR 
004000 


Dunham, 


. et.ai. 


Dine 

rlUS 


10UOZ/4U-lOUOZOl / 


004010 


Dunham, i 


. et.ai. 


Dine 

HUS 


lOl/OlZO-lQtfOH/V 


90/R99 
004D00 


Dunham, I 


. et.ai. 


Dine 

rlUS 


1 00O0ZU0- 1 oOoodUo 


OO/tRRR 
004000 


Dunham, 


. et.ai. 


Dine 

rlUS 


1RR70')'l>t 1RR70qi7 
100/ cc 14- 1 00/ ZO 1 / 


99/1001 

oo4oy J 


Dunham, i 


at ol 

. et.ai. 


rlUS 


lyzyy/ /(/- iyzyyy44 


99/109;! 

O04yo4 


Dunham, I 


. et.ai. 


Plus 


ZUlUo9/U-ZUlU4Uoo 


ooouio 


Dunham, 


at ol 

. et.ai. 


Dine 

- rlUS 


zuooz/yz-zuoozy4o 


99E1 oa 


Dunham, I 


. et.ai. 


Dlj.e 

riUS 


2 1 4oozoo-i: 1 4oooo4 


99C19G 
0001 zo 


Dunham, I 


at ol 

. et.ai. 


Dine 

rlUS 


Zl 441 OaU-Zl 441 4/1 


qqR17q 

000 1 / 9 


rinnham I 

UUIIlldlll, I 


et.ai. 


Plus 




335188  Dunham, I. et al.  Plus   21669118-21669328 
335211  Dunham, I. et al.  Plus   21774611-21774680 
335361  Dunham, I. et al.  Plus   22807292-22807445 
335379  Dunham, I. et al.  Plus   22899306-22899420 
335414  Dunham, I. et al.  Plus   23235546-23235684 
335416  Dunham, I. et al.  Plus   23237354-23237465 
335496  Dunham, I. et al.  Plus   24164386-24164545 
335497  Dunham, I. et al.  Plus   24167666-24167869 
335558  Dunham, I. et al.  Plus   24740167-24740347 
335586  Dunham, I. et al.  Plus   24990333-24990497 
335686  Dunham, I. et al.  Plus   25439839-25439920 
335784  Dunham, I. et al.  Plus   25942710-25942792 
335823  Dunham, I. et al.  Plus   26365925-26366004 
335983  Dunham, I. et al.  Plus   27938968-27939070 
335995  Dunham, I. et al.  Plus   28009044-28009184 
336021  Dunham, I. et al.  Plus   28686482-28686559 






WO 02/086443 



oobU34 


Dunham, 


ot al 

. et.ai. 


□ Inc. 

rlUS 


zy u i 44U4-zy u 1 4oy u 




Dunham, 


at al 

. et.aJ. 


□lite 

rJUS 


9Q099QRO onnOQIRt? 
Zi) UZZybO-zyUZol 00 


OOC1 A7 
OOOlU/ 


Dunham, 


. etal. 


Dine 

rlUS 


0QQQ7701 0Q0 n 7flRQ 

zyytj/ /oi-zyyo/oby 


QOCCQO 


Dunham, 


. et.ai. 


Dine 

rlUS 


yoooyu-yooozy 


ooceoo 


Dunham, 


. at. a!. 


Dine 

rlUS 


0RCCQ1 QQCO01 

yuooy i-yobzzi 


oobbo4 


Dunham, 


. eta). 


rJUS 


a fl ft o Q C QO C C ~Jf\ 


oobboo 


Dunham, 


. et.ai. 


Plus 


QR7QnP QRRTK>1 

yo / yuo-y 00004 


oobboo • 


Dunham, 


of at 

. et.ai. 


Dine 

rlUS 


0QR>1 1 ft QRQ1 OR 

yoo4io-yoyioo 


33RRQ7 


Dunham, 


of al 

. ei.ai. 


Pine 
rlUS 


yoy^/u-yyuo io 


oobboo 


Dunham, 


of at 

. et.ai. 


Ditto 
rlUS 


yy i yuo-y yoz*tu 


oobooy 


Dunham, 


ot ol 

. et.ai. 


□lite 
rlUS 


ioya*tuz- 1 0304/0 


oobba4 


Dunham, 


at al 

. et.ai. 


Dine 

rlUS 


/-+ZU0 t to-t4*iUD IO 


ooe70i 

336/21 


Dunham, 


. et.al. 


Ohio 

rlUS 


00/ \vicc~66l 1 000 


ooeono 

oobyuu 


Dunham, 


al al 

. et.ai. 


Dine. 

rlUS 


lUZOOH^O- 1 U^ODOZo 


33b948 


Dunham, 


at al 

. et.ai. 


Dine 

rlUS 


i zoyzzy u- 1 zoyzoo \ 


OQ7AOQ 

337028 


Dunham, 


at at 

. et.al. 


Plus 


lbb44oi f-i bb44y4z 


337054 


Dunham, 


. et.a). 


P)us 


17DO*17>IO 17QO1Q00 

1/021/42-1 /o2iyzZ 


0071 CO 

3371 62 


Dunham, 


. et.al. 


Plus 


zo4 / oy4o-zo4 / y 1 40 


337183 


Dunham, 


. eta!. 


Plus 


2oy4obUb-2oy4obyb 


337184 


Dunham, 


. et.al. 


Plus 


zoy / oy4y-zoy / 4ui b 


33 /2oo 


Dunham, 


. et.ai. 


rJUS 


Oflf\*J 1 070 OflA-l OA'iyl 

zoui iy/y-zouizuo4 


337299 


Dunham, 


. et.al. 


Plus 


zyuzZbob-zyuzz/ to 


337389 


Dunham, 


. et.al. 


Dine 

rlUS 


o I4U iouy-0 14U10/ y 


ooY4yo 


Dunham, 


ot al 

. et.ai. 


Plus 


^<t^n7Rn ^T^nnQRi 
ooooUf ou-oooouyo i 


oo/o4y 


Dunham, 


at a) 

. et.a!. 


Plus 


1dd7AA79 ^AA7A^1 


OO77CC 
00/ /bb 


Dunham, 


ot at 

. et.ai. 


Plus 


^Q7l7fi4_^Q71Qnn 

oy/ i/o-t-oyr iyuu 


OO.70.OG 

oo/ouy 


Dunham, 


ot a) 

. et.ai. 


Dine 

rlUS 


AAAQOR.Q AAAQAQ'Z 

w+yuoy-*t-+*ty i yo 


337871 


Dunham, 


at al 

. et.ai. 


Dine 

rlUS 


HAAW)7 KAAliM 
0440UZ7-04401U1 


3o795o 


Dunham, 


at al 

. et.ai. 


Dine 

rlUS 


ftQRQ1R9 RQRQ97n 

oyoy ioz-oyoyz/u 


oooUUo 


Dunham, 


at a) 

. et.ai. 


Plus 


7RQ7nfifl 7RQ79'-IR 

i oy ( uoo- 1 oy t zoo 


ooonoo 
oooUoo 


Dunham, 


. et.a). 


Plus 


RflQ919R flnQ9971 

ouyz izo-ouyzz/ 1 


0001 * n 
33811 U 


Dunham, 


. eta). 


Plus 


in*^RA4R-l 'in^RAR9'l 

iuoo-i*io I- 1 uoa-toz i 


OOQ1 1 o 

338112 


Dunham, 


. et.al. 


Dine 

rlUS 


i uoy i oy o- 1 uoy i buu 


338145 


Dunham, 


. et.al. 


Plus 


11 oobb29-ll oobby i 


onoA AO 

338148 


Dunham, 


at al 

. etal. 


Dine 
PIUS 


n44oyoo-i i44yuoo 


OOQ17A 

338179 


Dunham, 


at al 

. et.aJ. 


Dine 

rJUS 


jZoUoV /o-12oUt3ooo 


oooi 07 
338197 


Dunham, 


. et.al. 


Dine 

rlUS 


•nftQRin7 i^R^R1R*l 
I0D0O lUf -10000 101 


338279 


Dunham, 


. et.al. 


Plus 


4CHRQQAA -iRiROnOn 

ibi 00344-1 bibyuyi 


OOQ01 R 

oo831 b 


Dunham, 


. etal. 


Dine 

rlUS 


17^07-1 A '!7nRQ0RP 

i /uoy/ 1 1- 1 /uoyyoo 


oooo/Z 


Dunham, 


ot al 

. et.ai, 


Dine 

rlUS 


17139477 17139547 
1 f IOZh/ /- 1 / IOZO*t/ 


00000/ 


Dunham, 


. et.al. 


Dine 

rlUS 


1flnR9lR/L1flnR9An9 
1 OUOZ 1 0*^ 1 ouoz*tuz 


OOQQEO 

oobooy 


Dunham, 


at al 

. et.ai, 


Dine 

rlUS 


I OU / HHUZ- 1 OU/^OU I 


oooobb 


Dunham, 


ot al 

. et.ai. 


Dine 

rlUS 


1fl9a9fl9R 1R9591RQ 

i ozozuzo- 1 ozoz i oy 


0000/4 


Dunham, 


ot at 

. et.ai. 


Dine 
rlUS 


1R°.719nn 1R3719R9 
100/ IZUU-iOO/ IZ0Z 


338414 


Dunham, 


. etal. 


Dine 

rlUS 


iyo*too/o- 1 yo^ooou 


OOQ/I 1 Q 
00841 0 


Dunham, 


ot al 

. et.ai. 


Dine 

rlUS 


iy*tooouo- 1 y*toooyo 


338501 


Dunham, 


. et.al. 


Dine 

rlUS 


91 94471 3 91 94AR9H 
Z I Z44/1 0-Zl Z4404b 


338506 


Dunham, 


at al 

. et.al. 


Dine 

rlUS 


01001Q71 01001QKQ 
ZliZlO/1-Zl2Ziy0d 


338523 


Dunham, 


. et.al. 


Plus 


21 buy /bo-21 buy ob4 


oaaaaa 

338662 


Dunham, 


. et.al. 


Dine 

rlUS 


Z44U4 /2U-244UH0y y 


338804 


Dunham, 


. eta). 


PJus 


z/2obUUo-2/20biUo 


338836 


Dunham, 


. et.al. 


Plus 


077001CC O77Q0070 

2/ /yzibb-z/ /yzz/z 


100070 

338879 


Dunham, 


al al 

. et.al. 


Plus 


zo41 Ubbo-Zo41 U f 04 


oo89oY 


Dunham, 


at al 

. et.ai. 


Dine 

rlUS 


"QIRnRRR 9Q1Rn79R 

zy louobo-zy iou/zo 


OQQQOO 


Dunham, 


ot al 

. et.ar 


Dine 
rlUS 


30A777R7 °.nn7R1fi4 


ooyu4/ 


Dunham, 


. et.al. 


Dine 

rlUS 


30760703 307R0QRR 

ou/ou/ yo-ouiouyoo 


oooi OA 


Dunham, 


ot al 

. et.ai. 


Plus 


311415R0 311417R5 
0 1 I *» I OOU-0 I I *♦ 1 1 00 


339114 


Dunham, 


. etal. 


Dine 

rlUS 


31 4 C R4R4_31 4RR5 1 Q 
0 1 ^00404-0 1 *tooo i y 


339121 


Dunham, 


at al 

. et.ai. 


Plus 


o 1 0oo4b /-o i ooooob 


OOA1 7 A 

339170 


Dunham, 


at al 

. et.ai. 


Dine 

rlUS 


3991 R^OQ 3991 RR97 

ozz i ooyy-o<iz i ooz / 


ooy^yo 


Dunham, 


. etal. 


Dine 

rlUS 


33993R71 33993R1Q 

ooccOvt i-oozzooiy 


OOOOCO 

OdZOOO 


Dunham, 


. et.al. 


Minus 


133QR07 133Q3Q7 

looyour- looyoy/ 


ooonoo 
002982 


Dunham, 


. et.al. 


Minus 


9R9R9QR 9R9R100 

ZDZOzyo-zozo iuy 


332984 


Dunham, 


at al 

. et.al. 


Minus 


OR39ROR 9R39/IR7 

zoozoub-zooz40 / 


ooonno 

332998 


Dunham, 


at al 

. et.ai. 


Minus 


9711 70/ 9711RRR 
Zf I I / U4-Z/ I I000 


OOOOCO 

333058 


Dunham, 


at al 

. et.al 


Minus 


OOOQQOC 009RR11 

oU2oyzo-OUZool 1 


ooonn-r 
333097 


Dunham, 


at al 

. et.al. 


Minus 


0*50/10/1 Q90^03R 
0ZU4 1 Z4-0ZU4U0D 


333121 


Dunham, 


at al 

. etal. 


Minus 


OOOQ/MR OOOfl'iRfl 

ooUo44b-ooUoobo 


oon oo 

3331 22 


Dunham, 


. et.al. 


Minus 


QOOQKQC QQ0QR31 

oouy oyo-oouyoo i 


OOO'l oo 

333123 


Dunham, 


. eta). 


Minus 


O01OR17 0Q1O7AQ 
OOlUOl /-OO I U/ W 


qoh vi a 
333140 


Dunham, 


at al 

. et.al. 


Minus 


0Q77990 '337R30Q 

oo/ / ccfj'do i oouy 


OOOOCA 

ooozbU 


Dunham, 


at al 

. et.al. 


Minus 


Vl^nfl/100 430R3nA 
*t0Uo < tUU-40U00U4 


OOOCAQ 

333603 


Dunham, 


at al 

. et.al. 


Minus 


R/RRQQR R/RR797 

b4bbooo-b4bb/z/ 


oo3bU4 


Dunham, 


a* al 

. et.al. 


Minus 


R4fi7nQ0_fi4flfi7RR 
040 ( uyu-onoo / 00 


333904 


Dunham, 


. et.al. 


Minus 


0917*37/1 R9179R1 
OZ l ( 0/4-OZ I f ZO I 


ooooag 
33390b 


Dunham, 


. et.al. 


Minus 


R91R90R R91R0R3 
OZ I ozoo-oz ■ OUOO 


OQ>M QO 

3341 83 


Dunham, 


. etal. 


Minus 


11R39RR9 11R39aTIA 
1 1 OOZOOZ- 1 1 OOZOUO 


334187 


Dunham, 


. et.al. 


Minus 


11Q91^a"R 11Q9190R 

nyzi4bb-i lyzizuo 


334222  Dunham, I. et al.  Minus  12732417-12732289 
334223  Dunham, I. et al.  Minus  12734365-12734269 
334255  Dunham, I. et al.  Minus  13200776-13200692 
334492  Dunham, I. et al.  Minus  14478333-14478172 
334648  Dunham, I. et al.  Minus  15363301-15363222 
334787  Dunham, I. et al.  Minus  16299093-16298937 
334933  Dunham, I. et al.  Minus  20078117-20077991 






334945  Dunham, I. et al.  Minus  20138885-20138637 
334967  Dunham, I. et al.  Minus  20173311-20173218 
334990  Dunham, I. et al.  Minus  20341159-20341087 
335093  Dunham, I. et al.  Minus  21297367-21297214 
335288  Dunham, I. et al.  Minus  22304275-22303770 
335289  Dunham, I. et al.  Minus  22305950-22305708 
335548  Dunham, I. et al.  Minus  24662773-24662673 
335551  Dunham, I. et al.  Minus  24679828-24678961 
335619  Dunham, I. et al.  Minus  25082677-25082498 
335620  Dunham, I. et al.  Minus  25092561-25092434 
335621  Dunham, I. et al.  Minus  25098878-25098767 
335682  Dunham, I. et al.  Minus  25421215-25421093 
335755  Dunham, I. et al.  Minus  25763806-25763747 
335814  Dunham, I. et al.  Minus  26320043-26319845 
335815  Dunham, I. et al.  Minus  26320518-26320421 
335835  Dunham, I. et al.  Minus  26393311-26393245 
335851  Dunham, I. et al.  Minus  26604863-26604742 
335868  Dunham, I. et al.  Minus  26711437-26711300 
335896  Dunham, I. et al.  Minus  26977639-26977558 
335936  Dunham, I. et al.  Minus  27360474-27360400 
335948  Dunham, I. et al.  Minus  27555924-27555788 
336066  Dunham, I. et al.  Minus  29241080-29240842 
336205  Dunham, I. et al.  Minus  30477456-30477311 
336275  Dunham, I. et al.  Minus  32086675-32086536 
336292  Dunham, I. et al.  Minus  32818035-32817927 
336331  Dunham, I. et al.  Minus  33594527-33594371 
336419  Dunham, I. et al.  Minus  34052568-34052445 
336675  Dunham, I. et al.  Minus  2020758-2020664 
336684  Dunham, I. et al.  Minus  2158060-2157993 
336716  Dunham, I. et al.  Minus  3259952-3259862 
336798  Dunham, I. et al.  Minus  5888954-5888757 
337043  Dunham, I. et al.  Minus  17407330-17407251 
337046  Dunham, I. et al.  Minus  17610892-17610821 
337128  Dunham, I. et al.  Minus  22215251-22215034 
337192  Dunham, I. et al.  Minus  24591853-24591771 
337194  Dunham, I. et al.  Minus  24610510-24610359 
337229  Dunham, I. et al.  Minus  26716579-26716481 
337325  Dunham, I. et al.  Minus  30015948-30015800 
337497  Dunham, I. et al.  Minus  33371317-33371258 
337500  Dunham, I. et al.  Minus  33376212-33376158 
337603  Dunham, I. et al.  Minus  1299296-1299194 
337605  Dunham, I. et al.  Minus  1346555-1346397 
337671  Dunham, I. et al.  Minus  3260634-3260547 
337786  Dunham, I. et al.  Minus  4133203-4133081 
337862  Dunham, I. et al.  Minus  5347658-5347550 
338083  Dunham, I. et al.  Minus  9318438-9318301 
338158  Dunham, I. et al.  Minus  11794465-11794343 
338161  Dunham, I. et al.  Minus  12124716-12124658 
338182  Dunham, I. et al.  Minus  12824919-12824827 
338189  Dunham, I. et al.  Minus  12878594-12878478 
338199  Dunham, I. et al.  Minus  13760865-13760780 
338215  Dunham, I. et al.  Minus  14055447-14055355 
338469  Dunham, I. et al.  Minus  20520387-20520242 
338549  Dunham, I. et al.  Minus  22049171-22049081 
338561  Dunham, I. et al.  Minus  22311966-22311856 
338671  Dunham, I. et al.  Minus  24508421-24508346 
338676  Dunham, I. et al.  Minus  24637427-24637369 
338726  Dunham, I. et al.  Minus  25926206-25925618 
338779  Dunham, I. et al.  Minus  27030151-27029795 
338871  Dunham, I. et al.  Minus  28301708-28301611 
338872  Dunham, I. et al.  Minus  28300921-28300790 
338966  Dunham, I. et al.  Minus  29614876-29614749 
339229  Dunham, I. et al.  Minus  32722330-32722199 
339264  Dunham, I. et al.  Minus  32975145-32975053 
325228  6381940            Plus   2630-2694 
325235  6381943            Minus  162154-162264 
329588  3962484            Plus   1169-1619 
329560  3962491            Plus   2095-2990 
329541  3983503            Minus  2765-3059 
325328  5866875            Plus   86780-86854 
325340  6017033            Minus  166656-166819 
325373  5866920            Minus  1136686-1136777 
325367  5866920            Minus  922881-922958 
325389  5866921            Plus   239672-239759 
325436  5866939            Minus  29778-29907 
325498  5866967            Plus   173372-173930 
325471  6017034            Minus  289268-289342 
325557  6056302            Plus   50921-51050 
325559  6249595            Minus  118590-119172 
325560  6249595            Minus  133794-133981 
325569  6249599            Plus   79927-80217 
325587  6682462            Plus   126724-126967 
325585  6682462            Plus   73476-73574 
325597  5866992            Plus   1065020-1065089 
325639  5867002            Plus   253525-253608 







325739  5867038            Minus  205138-205269 
325740  586703R            Minus  207533-207690 
325792  6469828            Minus  1018-1176 
325735  6552447            Minus  269122-269190 
325685  6682468            Plus   117397-117483 
325686  6682468            Plus   118337-118439 
325819  6682490            Minus  130314-130370 
329764  6048195            Minus  109733-109968 
329703  6065793            Minus  139994-140138 
329643  6448539            Plus   53403-53537 
329816  6624888            Minus  70296-70423 
329R60  6687260            Minus  163474-163605 
325883  5867087            Plus   22498-22663 
395R95  58670Q7            Plus   358317-358476 
325925  5867124            Plus   115749-115962 
325932  5867127            Plus   7369-7441 
325941  5867133            Minus  64228-64402 
325969  5867153            Plus   101911-102081 
325971  5867153            Plus   105841-106035 
329993  4567166            Minus  101307-101434 
330020  6671887            Plus   172397-172491 




[illegible]  5867168       Minus  7831-8035 
326274  5867171            Minus  410289-410404 
326025  5867176            Plus   70854-70915 
326046  5867182            Minus  62668-62825 
326099  5867186            Minus  661381-661510 
326108  5867187            Minus  23784-23903 
326165  5867208            Minus  62787-62929 
326189  5867212            Plus   69288-69413 
326204  5867218            Minus  148088-148200 
326230  5867230            Minus  301868-301972 
330052  4567182            Plus   352560-352963 
330036  6042048            Plus   117120-117216 
326360  5867293            Plus   13627-13844 
32R5R9  58R7320            Plus   22760-22919 
326393  5867341            Plus   41702-41841 
326505  5867435            Minus  8818-8949 
326515  5867439            Plus   36683-36809 
326592  6138928            Plus   23689-23828 
330107  6015249            Minus  100091-100282 
330106  6015249            Minus  99443-99778 
330100  6015253            Plus   21166-21301 
330093  6015278            Plus   1043-1199 
330088  6015293            Plus   37517-37638 
330085  6015302            Minus  59613-59770 




[illegible]  6671864       Minus  127553-127656 
330123  6671869            Minus  35311-35406 
326742  5867611            Minus  95187-95248 
326605  5867637            Plus   24656-24749 
326818  6117831            Minus  15199-15309 
326720  6552456            Plus   84525-84677 
326770  6598307            Minus  513603-513668 
326692  6682502            Plus   117697-117899 
326693  6682502            Minus  335002-335095 
326983  5867657            Minus  16023-16581 
326991  5867660            Plus   18147-18339 
326936  6004446            Minus  10217-10357 
326964  6469836            Plus   75340-75456 
327040  6531965            Plus   783670-783817 
327053  6531965            Plus   2247267-2247437 
327075  6531965            Plus   4041318-4041431 
327085  6531965            Plus   4734947-4735069 
327036  6531965            Plus   319951-320040 
327130  6531976            Plus   20247-22343 
327156  5866841            Minus  2462-2620 
327288  5867481            Plus   48583-48773 
327332  5867516            Minus  56361-56532 
327220  5867525            Minus  65701-65781 
327224  5867534            Plus   188468-188544 
327321  6249562            Minus  99745-99836 
327361  6552412            Minus  61013-62130 
327396  5867743            Plus   8702-8820 
327414  5867750            Plus   102461-102586 
327442  5867759            Plus   111483-111618 
327467  5867772            Plus   88030-88151 
327473  5867775            Plus   75101-75181 
327483  5RR7783            Plus   181573-181662 
327377  5867793            Minus  37610-37676 
327562  5867804            Minus  343989-344474 
327568  5867811            Minus  46152-46287 
327606  6004463            Plus   200262-200495 
327611  5867868            Minus  175063-175392 
327642  5867891            Minus  2513-2743 
327654  5867910            Minus  97564-97710 
327734  5867940            Minus  31003-31583 



PCT/US02/12476 






327775 


OODr 9b4 


Minus 


1 9.0/70.1 190071 


327796 


5867982 


Plus 


ob2b/-ob40b 


327840 


6249578 


Minus 


/oUbb-/o2Ub 


330208 


6013599 


Plus 


CCG17 CCQ01 

bbbl /-bb9dl 


330263 


6671 884 


Minus 


iaigao iaicq/I 
lU1bUd-1U1bo4 


328004 


b8fa/99d 


Minus 


■HZ7A(\7 1C70P7 

lb/4U7-lb/oo7 


328101 


5868020 


Plus 


opoqoa oortni/i 
2oyy2U-29UU14 


328100 


bobo02U 


Minus 


ZbdDAo-ZbJboD 


QOtH 1 Q 

328113 


eqcqao/i 
00D0U24 


Minus 


RA07P OA/IQ.1 


328157 


00D0UD4 


Pino 

rlUS 


f ooZO-MO IO 




OODOUOU 


Minus 


1 ODD I - 10 / zy 




bobbUol 


Minus 


4Z IOd-*tZ*ti30 


Q070J1 A 

oZ /y4U 


CQCQ1Q7 


Minus 


yoz4u-yo<*zo 


o2/yo4 


EQGQ01 G 

oobbzlb 


Plus 


ODD 1 1-uOOf / 


328021 


coaojuio 
by02482 


rlUS 


71Q/I7Q 71 /lean 
nd4/o-/i4byu 


328068 


G1 1 7P1fl 

611 /o19 


□flip 
rlUS 


Zbdyud-zb4Uzz 


328264 


COOH Q1 o 

6381912 


rlUS 


bbUob-bb4U4 


330300 


2905862 


Minus 


oo/ic ooao 
dZ4b-ddU2 


328608 


5868222 


Minus 


P777A Q70C9. 


328600 


5868229 


Minus 


qqqdo viaaia 
db 889-40010 


328616 


5868239 


Plus 


293920-294224 


328623 


rneoo Jiff 

5868246 


Minus 


ioaaoa ioaiog 
12UUZ0-1201Zb 


328632 


5868247 


Plus 


/b/d4-fbobd 


328666 


CQCOICil 

oobo2o4 


Minus 


77P am 


328698 


bob82b4 


Minus 


DZOuDb-bZOoJo 


328700 


58b82b4 


PIUS 


7R/inPQ 7R/J0ni 

/04Uoy-/b4ZUd 


OOQ7AQ 

328708 


CQC0071 

5868271 


Minus 


Do! l4-b0004 


328735 


5868289 


rlUS 


tjyooy-oy4bb 


328743 


CQCQOQO 

5868289 


rlUS 




328806 


5oboo24 


PIUS 


zy4Uo-zyboH 


328299 


EQCOQGC 

bbbBobb 


Minus 


1/1Q7HR 1/1QRPQ 

i4of uo- i^yooy 


328342 


5868383 


Dims 

Plus 


byybb-buuy4 


ooooee 

328365 


5868387 


Minus 


07fV70/l 0707QP 


328369 


5868388 


Plus 


(oof Wbboo 


328381 


5868392 


PIUS 


DbZrbo-bbzo4o 


328451 


5868425 


Minus 


O1707G 017QQR 


328481 


5868449 


Minus 


P0P7 Q1PA 


328500 


f-Q/jQ AC A 

bob84b4 


Plus 


oyuyo-by^oi 


328530 


COCO A QO 

58b8482 


Plus 


o*J4y/o-oob4Uo 


328664 


6004473 


Plus 


1 1 Q9.7Q.O. 1 1 QQQ£ 


OOOOC<4 

328861 


6381928 


Minus 


1AP017 ino/ino 


328908 


b8b84yo 


PIUS 




0000.0.0. 

32893J 


bobbbUU 


Din*. 

rlUS 


771 7CC 771 PRO 
/ f I / Ob-/ / lOOtf 


o2o9o4 


bbbbbUU 


Dlno 

rlUS 


PAK9/10 P/IR/t/lP 
040o4Z-0^0 t ^ £ *u 


328949 


645o7bb 


Minus 


4obbz-4obi y 


OOA01 0 

330313 


CA/IOAOA 
bU420oU 


Minus 


o«JD4z-oo/ /b 


329005 


5868542 


PIUS 


ob4f U-obb/o 


330366 


on a a i ac 

2944106 


PIUS 


ibioo/-ibiyi4 


330372 


6580495 


Minus 


*317y<C1 '517PPP 

ol/4bl-dl/boo 


329033 


5868561 


Minus 


conn c>l7Q 

bjyu-b4/y 


329037 


5868562 


Minus 


QOXCC 9.9CG.9 

oz4bb-o2bbz 


329067 


5868591 


Minus 


1/1C>117 1yl7CC5 

14b41 M4f bbz 


Aftn4 Oil 

329134 


5868679 


Plus 


2yyb9-ouui 8 


329157 


5868687 


Minus 


1 ACOAn 1 Ad< CC 

14ba40-14o1bb 


329178 


5868704 


Plus 


17Q177 l?©./!^^ 

i / yi / /-l / y4bd 


oonn no 

329192 


COCQ71 C 

bbbo/lb 


Plus 


looyjo-io/ uzu 




OOOOr to 


IVm tub 


OU*t*TUU i)U7v) J«7 


329204  5868720            Minus  3050-3190 
329224  5868728            Plus   27422-27664 
329228  5868728            Minus  50118-50287 
329288  5868771            Plus   25554-26299 
329337  5868806            Minus  467155-467222 
329011  6682532            Plus   48658-48741 




TABLE 9A: Potential Therapeutic, Diagnostic and Prognostic targets for Therapy of Lung Cancer 

Table 9A shows about 1312 genes up-regulated in lung tumors (including squamous cell carcinomas, adenocarcinomas, small cell carcinomas, granulomatous and carcinoid 
tumors) relative to normal body tissues. These genes were selected from about 59680 probesets on the Eos/Affymetrix Hu03 Genechip array. 

Table 9B shows the accession numbers for those Pkeys in Table 9A lacking Unigene IDs. For each probeset we have listed the gene cluster number from which the 
oligonucleotides were designed. Gene clusters were compiled using sequences derived from Genbank ESTs and mRNAs. These sequences were clustered based on sequence 
similarity using Clustering and Alignment Tools (DoubleTwist, Oakland, California). The Genbank accession numbers for sequences comprising each cluster are listed in the 
"Accession" column. 

Table 9C shows the genomic position for those Pkeys in Table 9A lacking Unigene IDs and accession numbers. For each predicted exon, we have listed the genomic 
sequence source used for prediction. Nucleotide locations of each predicted exon are also listed. 



Pkey: Unique Eos probeset identifier number 

ExAccn: Exemplar Accession number, Genbank accession number 

UnigeneID: Unigene number 

Unigene Title: Unigene gene title 

R1: Average of lung tumors (including squamous cell carcinomas, adenocarcinomas, small cell carcinomas, granulomatous and carcinoid tumors) divided by the average of normal lung samples 

R2: Average of non-malignant lung disease samples (including bronchitis, emphysema, fibrosis, atelectasis, asthma) divided by the average of normal lung samples 
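
As a rough illustration of how such ratios are formed, the sketch below computes the tumor ratio (R1) and the non-malignant disease ratio for a single probeset. It assumes an arithmetic mean over samples and uses made-up intensity values; the function name `expression_ratio` is hypothetical, and the normalization actually applied to the array data is not described in this section.

```python
from statistics import mean


def expression_ratio(sample_values, normal_values):
    """Return the average of sample_values divided by the average of normal_values."""
    if not sample_values or not normal_values:
        raise ValueError("both groups need at least one sample")
    normal_mean = mean(normal_values)
    if normal_mean == 0:
        raise ValueError("normal-lung average is zero; ratio is undefined")
    return mean(sample_values) / normal_mean


# Made-up expression intensities for one probeset (illustrative only).
tumor = [120.0, 80.0, 100.0]
non_malignant = [60.0, 40.0]
normal = [50.0, 50.0]

r1 = expression_ratio(tumor, normal)
disease_ratio = expression_ratio(non_malignant, normal)
print(r1, disease_ratio)  # 2.0 1.0
```

Under these assumptions, a probeset with a high tumor ratio and a disease ratio near 1 would be elevated in tumors but not in non-malignant lung disease.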
2.02 


6.08 


100 


2100 


7.27 


25.00 


1.00 


100 


32.58 


7100 


21.28 


9.55 


68.83 


61.00 


1.83 


4.02 


1.42 


2.54 


1.00 


54.00 


24.18 


52.00 


3.21 


4.72 


38.63 


113.00 


62.88 


147.00 


2.35 


3.62 


10.84 


57.00 


3.18 


2.37 


2.89 


2.09 


2.02 


1.41 


1.29 


1.14 


142.99 


17.00 


1.41 


99.00 
441128


M570256 




ESTs, Weakly similar to T23273 hypotheti 


441290 


W27501 


Hs.89605 


cholinergic receptor, nicotinic, alpha p 


441362 


BE614410 


Hs.23044 


RAD51 (S. cerevisiae) homolog (E coli Re 


441377 


BE218239 


Hs.202656 


ESTs 


441390 


AI692560 


Hs.131175 


ESTs 


441497 


R51064 


Hs.23172 


ESTs 


441525 


AW241867 


Hs.127728


ESTs 


441553 


AA281219 


Hs.121296 


ESTs 


441607 


NM_005010 


Hs.7912 


neuronal cell adhesion molecule 


441633 


AW958544 


Hs.112242


normal mucosa of esophagus specific 1 


441636 


M081846 


Hs.7921 


Homo sapiens mRNA; cDNA DKFZp566E183 (fr 


441737 


X79449 


Hs.7957 


adenosine deaminase, RNA-specific 


441790 


AA401369 


Hs.190721 


ESTs 


441801 


AW242799 


Hs.86366 


ESTs 


441919 


AI553802 


Hs.128121 


ESTs 


441937 


R41782 


Hs.22279 


ESTs 


441954 


AI744935 


Hs.8047 


Fanconi anemia, complementation group G 


442025 


AW887434 


Hs.11810 


CDA11 protein 


442029 


AW956698 


Hs.14456 


neural precursor cell expressed, develop 


442072 


AI740832


Hs.12311 


Homo sapiens clone 23570 mRNA sequence 


442108 


AW452649 


Hs.166314 


ESTs 


442117 


AW664964 


Hs.128899


ESTs 


442137 


AA977235 


Hs.128830


ESTs, Weakly similar to Z192_HUMAN ZINC 


442159 


AW163390


Hs.278554 


heterochromatin-like protein 1 


442179 


AA983842 


Hs.333555 


chromosome 2 open reading frame 2 


442328 


AI952430 


Hs.150614


ESTs, Weakly similar to ALU4_HUMAN ALU S


442432 


BE093589 


Hs.38178 


hypothetical protein FLJ23468


442530 


AI580830


Hs.176508


Homo sapiens cDNA FLJ14712 fis, clone NT


442547 


AA306997 


Hs.217484 


ESTs, Weakly similar to ALU1_HUMAN ALU S


442556 


AL137761


Hs.8379 


Homo sapiens mRNA; cDNA DKFZp586L2424 (f 


442619 


AA447492 


Hs.20183 


ESTs, Weakly similar to AF164793_1 prote


442710 


AI015631 


Hs.23210 


ESTs 


442717 


R88362 


Hs.180591


ESTs, Weakly similar to T23976 hypotheti 


442875 


BE623003 


Hs.23625 


Homo sapiens clone TCCCTA00142 mRNA sequ 


442914 


AW188551


Hs.99519


hypothetical protein FLJ14007


442932 


M45721 1 


Hs.8858 


bromodomain adjacent to zinc finger doma


442942 


AW167087


Hs.131562


ESTs 


443068 


AI188710




ESTs 


443204 


AW205878 


Hs.29643 


Homo sapiens cDNA FLJ131G3 fis, clone NT 


443211 


AI128388 


Hs.143655


ESTs 


443247 


BE614387 


Hs.333893 


c-Myc target JP01 


443324 


R44013 


Hs.164225 


ESTs 


443383 


AI792453 


Hs.166507


ESTs 


443400 


R28424 


Hs.250648 


ESTs 


443426 


AF098158 


Hs.9329 


chromosome 20 open reading frame 1 


443572 


AA025610 


Hs.9605 


cleavage and polyadenylation specific fa 


443575 


AI078022 


Hs.269636 


ESTs, Weakly similar to ALU1_HUMAN ALU S


443614 


AV655386 


Hs.7645 


fibrinogen, B beta polypeptide 


443633 


AL031290 


Hs.9654 


similar to pregnancy-associated plasma p 


443648 


AI085377


Hs.143610 


ESTs 


443715 


AI583187 


Hs.9700


cyclin E1 


443723 


AI144442


Hs.157144


syntaxin 6 


443802 


AW504924 


Hs.9805 


KIAA1291 protein 


443859 


NM_013409


Hs.9914 


follistatin 


443892 


AA401369 


Hs.190721 


ESTs 


443947 


W24187 




gb:zb47f09.r1 Soares_fetal_lung_NbHL19W


443991 


NM_002250 


Hs.10082 


potassium intermediate/small conductance 


444006 


BE395085 


Hs.10086


type 1 transmembrane protein Fn14 


444009 


AI380792 


Hs.135104


ESTs 


444017 


U04840 


Hs.214 


neuro-oncological ventral antigen 1 


444127 


N63620 


Hs.13281


ESTs 


444129 


AW294292 


Hs.256212 


ESTs 


444279 


U62432 


Hs.89605 


cholinergic receptor, nicotinic, alpha p 


444371 


BE540274 


Hs.239 


forkhead box M1


444378 


R41339 


Hs.12569


ESTs 


444381 


BE387335 


Hs.283713 


ESTs, Weakly similar to S64054 hypotheti 


444461 


R53734 


Hs.25978 


ESTs, Weakly similar to 2109260A B cell


444471 


AB020684 


Hs.11217


KIAA0877 protein 


444489 


AI151010


Hs.157774


ESTs 


444619 


BE538082 


Hs.8172 


ESTs, Moderately similar to A46010 X-lin


444665 


BE613126 


Hs.47783 


B aggressive lymphoma gene 


444707 


AI188613 


Hs.41690 


desmocollin 3 


444735 


BE019923 


Hs.243122 


hypothetical protein FLJ13057 similar to


444781 


NM_014400


Hs.11950


GPI-anchored metastasis-associated prote 


444783 


AK001468 


Hs.62180 


anillin (Drosophila Scraps homolog), act 


445236 


AK001676 


Hs.12457


hypothetical protein FLJ10814


445258 


AI635931 


Hs.147613


ESTs 


445413 


AA151342 


Hs.12677


CGI-147 protein


445417 


AK001058 


Hs.12680 


Homo sapiens cDNA FLJ10196 fis, clone HE 


445443 


AV653838 


Hs.322971 


ESTs 


445462 


AA378776 


Hs.288649 


hypothetical protein MGC3077 


445517 


AF203855 


Hs.12830


hypothetical protein 


445537 


AJ245671 


Hs.12844


EGF-like-domain, multiple 6 


445580 


AF167572


Hs.12912


skb1 (S. pombe) homolog


445654 


X91247 


Hs.13046 


thioredoxin reductase 1 



4.13 


3.50 


1.00 


1.00 


130.23 


43.00 


22.03 


1.00 


3.65 


7.70 


1.00 


1.00 


1.53 


1.42 


1.89 


1.57 


1.47 


2.11 


216.22 


363.00 


2.31 


2.05 


1.30 


1.49 


44.15 


17.00 


1.00 


1.00 


1.00 


122.00 


0.86 


1.37 


1.48 


1.39 


1.00 


46.00 


9.92 


45.00 


25.05 


77.00 


3.61 


3.14 


3.00 


5.49 


1.00 


1.00 


1.92 


1.66 


27.22 


50.00 


5.00 


3.42 


181.59 


76.00 


10.59 


144.00 


109.23 


98.00 


1.00 


53.00 


29.02 


50.00 


1.00 


19.00 


1.00 


5.00 


22.85 


50.00 


25.33 


82.00 


3.18 


4.41 


8.45 


64.00 


1.00 


27.00 


1.00 


24.00 


12.42 


2.00 


128.84 


96.00 


0.02


4.59 


1.00 


47.00 


18.52 


61.00 


4.02 


1.75 


2.98 


2.57 


1.00 


29.00 


1.00 


16.00 


1.00 


39.00 


39.81 


70.00 


48.74 


7.00 


1.29 


1.30 


1.75 


1.61 


1.35 


1.13 


1.00 


17.00 


1.33 


1.64 


5.71 


6.87 


1.47 


1.92 


1.00 


77.00 


1.00 


1.00 


1.00 


29.00 


1.00 


1.00 


0.60 


7.80 


2.91 


1.14 


1.00 


1.00 


469.00 


556.00 


12.88 


105.00 


24.91 


90.00 


1.00 


111.00 


1.00 


70.00 


30.56 


139.00 


1.00 


1.00 


77.02 


90.00 


1.57 


1.31 


77.55 


2.00 


1.00 


27.00 


1.00 


73.00 


28.14


50.00 


1.81 


2.62 


1.00 


1.00 


2.09 


1.70 


1.87 


70.00 


1.71 


2.72 


1.52 


1.34 


1.51 
445669


AI570830


Hs.174870


ESTs 


10.95 


11.45 


445818 


BE045321 


Hs.136017 


ESTs 


1.00 


1.00


445873 


AA250970 


Hs.251946 


poly(A)-binding protein, cytoplasmic 1-1 


49.42 


54.00 


445885 


AI734009


Hs.127699


KIAA1603 protein 


1.00 


132.00 


445898 


AF070623 


Hs.13423 


Homo sapiens clone 24468 mRNA sequence 


1.00 


1.00 


445903 


AI347487


Hs.132781


class I cytokine receptor 

Homo sapiens clone 24859 mRNA sequence 


1.00 


36.00 


445932 


BE046441 


Hs.333555 


2.41


2.88 


445982 


BE410233 


Hs.13501


pescadillo (zebrafish) homolog 1, contai


1.60 


1.35 


446078 


AI339982 


Hs.156061


ESTs 


1.00 


42.00 


446102 


AW168067


Hs.317694 


ESTs 


1.00 


1.00 


446157 


BE270828 


Hs.131740 


Homo sapiens cDNA: FLJ22562 fis, clone H 


1.70 


1.53 


446269 


AW263155 


Hs.14559 


hypothetical protein FLJ10540


73.01 


48.00 


446292 


AF081497 


Hs.279682 


Rh type C glycoprotein 


1.55 


1.26 


446293 


AI420213 


Hs.149722 


ESTs 


1.00 


2.00 


446423 


AW139655 


Hs.150120 


ESTs 


1.10 


4.19 


446428 


AW082270 


Hs.12496 


ESTs, Weakly similar to ALU4_HUMAN ALU S 


0.53 


3.26 


446432 


AI377320 


Hs.150058 


ESTs 


1.00 


5.00 


446528 


AU076640 


Hs.15243


nucleolar protein 1 (120kD) 


1.36 


1.31 


446574 


AI310135 


Hs.335933 


ESTs 


3.89 


72.00 


446619 


AU076643 


Hs.313 


secreted phosphoprotein 1 (osteopontin, 


32.03 


20.23 


446636 


AC002563 


Hs.15767


citron (rho-interacting, serine/threonin 


4.19 


5.07 


446783 


AW138343


Hs.141867 


ESTs 


2.82 


9.47 


446839 


BE091926 


Hs.16244


mitotic spindle coiled-coil related prot


110.28 


28.00 


446849 


AU076617 


Hs.16251 


cleavage and polyadenylation specific fa


3.26 


2.94 


446856 


AI814373 


Hs.164175


ESTs 


6.38 


11.30 


446872 


X97058 


Hs.16362


pyrimidinergic receptor P2Y, G-protein c


1.98 


2.03 


446880 


AI811807


Hs.108646


Homo sapiens cDNA FLJ14934 fis, clone PL


94.90 


113.00 


446921 


AB012113 


Hs.16530


small inducible cytokine subfamily A (Cy 


1.67 


3.90 


446989 


AK001898 


Hs.16740 


hypothetical protein FLJ11036 


2.82 


3.12 


447022 


AW291223 


Hs.157573


ESTs 


1.00 


170.00 


447033 


AI357412


Hs.157601


ESTs


7.15


107.00 


447078 


AW885727 


Hs.9914 


ESTs 


47.24 


24.00 


447081 


Y13896 


Hs.17287 


potassium inwardly-rectifying channel, s 


0.12 


17.88 


447131 


NM_004585 


Hs.17466 


retinoic acid receptor responder (tazaro 


0.97 


1.48 


447149 


BE299857 


Hs.326 


TAR (HIV) RNA-binding protein 2 


1.24 


1.26 


447153 


AA805202 


Hs.315562 


ESTs 


1.00 


54.00 


447164 


AF026941 


Hs.17518 


Homo sapiens cig5 mRNA, partial sequence 


1.00 


67.00 


447178 


AW594641 


Hs.192417 


ESTs 


3.42 


50.00 


447250 


AI878909


Hs.17883


protein phosphatase 1G (formerly 2C), ma


1.60


1.52


447289 


AW247017 


Hs.36978


melanoma antigen, family A, 3 


1.00 


1.00 


447342 


AH 99268 


Hs.19322 


Homo sapiens, Similar to RIKEN cDNA 2010 


28.63 


1.00 


447343 


AA256641 


Hs.236894 


ESTs, Highly similar to S02392 alpha-2-m


146.62 


51.00 


447350 


AI375572 


Hs.172634


ESTs 


1.00 


12.00 


447377 


N27687 


Hs.334334 


transcription factor AP-2 alpha (activat 


2.55 


63.00 


447415 


AW937335 


Hs.28149 


ESTs, Weakly similar to KF3B_HUMAN KINES


0.91 


1.13 


447425 


M983747 


Hs.18573 


acylphosphatase 1, erythrocyte (common) 


1.00 


35.00


447519 


U46258 


Hs.339665 


ESTs 


59.89 


49.00 


447532 


AK000614 


Hs.18791


hypothetical protein FLJ20607


1.23 


1.63 


447534 


AA401369 


Hs.190721 


ESTs 


1.00 


17.00 


447636 


Y10043 




high-mobility group (nonhistone chromoso


1.41 


1.11 


447688 


N87079 


Hs.19236


Target CAT 


1.00 


39.00 


447733 


AF157482 


Hs.19400


MAD2 (mitotic arrest deficient, yeast, h 


1.17 


1.12 


447769 


AW873704 


Hs.320831 


Homo sapiens cDNA FLJ14597 fis, clone NT


6.47 


5.95 


447802 


AW593432 


Hs.161455 


ESTs 


0.73 


2.34 


447850 


AB018298 


Hs.19822 


SEC24 (S. cerevisiae) related gene famil 


86.45 


116.00 


447924 


AI817226 


Hs.313413 


ESTs, Weakly similar to T23110 hypotheti


1.00 


1.00 


447973 


AB011169 


Hs.20141 


similar to S. cerevisiae SSM4 


3.50 


4.27 


448030 


N30714


Hs.325960 


membrane-spanning 4-domains, subfamily A 


4.13 


142.00 


448105 


AI538613 


Hs.298241 


Transmembrane protease, serine 3 


1.15 


2.24 


448243 


AW369771 


Hs.52620 


integrin, beta 8 


15.84 


1.00 


448278 


W07369 


Hs.11782


ESTs 


0.97 


1.90 


448290 


AK002107 


Hs.20843 


Homo sapiens cDNA FLJ11245 fis, clone PL


1.00 


1.00 


448296 


BE622756 


Hs.10949 


Homo sapiens cDNA FLJ14162 fis, clone NT


2.42 


2.17 


448357 


BE274396 


Hs.108923 


RAB38, member RAS oncogene family 


1.44 


1.08 


448390 


AL035414 


Hs.21068 


hypothetical protein 


1.00 


43.00 


448489 


AW504732 


Hs.21275 


hypothetical protein FLJ11011 


2.63 


2.49 


448569 


BE382657 


Hs.21486 


signal transducer and activator of trans 


1.84 


2.53 


448663 


BE614599 


Hs.106823


hypothetical protein MGC14797 


3.29 


46.00 


448672 


AI955511 


Hs.225106 


ESTs 


1.00 


21.00 


448733 


NM_005629 


Hs.187956 


solute carrier family 6 (neurotransmitte 


1.82 


1.08 


448741 


BE614567 


Hs.19574


hypothetical protein MGC5469 


2.48 


1.92 


448757 


AI366784 


Hs.48820 


TATA box binding protein (TBP)-associate 


23.53 


20.00 


448775 


AB025237 


Hs.388 


nudix (nucleoside diphosphate linked moi 


2.34 


1.97 


448826 


AI580252 


Hs.293246 


ESTs, Weakly similar to putative p150 [H 


74.07 


62.67 


448830 


AL031658 


Hs.22181 


hypothetical protein dJ310O13.3 


1.37 


1.31 


448844 


AI581519 


Hs.177164 


ESTs 


1.00 


31.00 


448988 


Y09763 


Hs.22785 


gamma-aminobutyric acid (GABA) A recepto 


1.84 


1.95 


448993 


AI471630 




KIAA0144 gene product


1.63 


1.49 


449003 


X76342 


Hs.389 


alcohol dehydrogenase 7 (class IV), mu o 


1.00 


1.00 


449029 


N28989 


Hs.22891 


solute carrier family 7 (cationic amino 


1.97 


2.26 


449040 


AF040704 


Hs.149443 


putative tumor suppressor 


0.97 


1.56 


449048 


Z45051


Hs.22920


similar to S68401 (cattle) glucose induc


27.13 


90.00 


449053 


AI625777 


Hs.344766 


ESTs 


8.33 


44.00 


449054 


AF148848 


Hs.22934 


myoneurin 


73.85 


104.00 


449101 


AA205847 


Hs.23016 


G protein-coupled receptor 


2.58 


27.00 
449523
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AI034339 AW674593 N721 56 AI079733 AI038683 AI291 61 6 M491 599 AA993675 AA837380 BE006554 BE006473 AI087090 T33044 
AA652043 AI203503 AA583959 W35283 AI129926 Z41844 AW020925 AW575848 A1684603 AA493297 AI140689 AI277175 AA425444 
AI932767 W02632 BE396786 R37261 
AW207206 AW341473 M448195 AI951341 

AA249027 AL038984 AK001993 AL080066 AV652725 BE566226 AA345557 AA315222 AA090585 AA375688 AA301092 AA298454 W05762 
AW607939 H51658 D83880 N84323 BE296821 AW947007 D61461 AW079261 AA329482 AW90 1780 Al 354442 AA772275 R31663 A1354441 
AI767525 H92431 AI916735 H93575 AI394255 AW014741 AI573090 C06195 AW612857 AW265195AI339558AI377532 AI308821 AI919424 
AI589705 AW05521 5 AI336532 AI338051 AA806547 C75509 C0061 8 AW071 1 72 AW769904 AA630381 AI67801 8 A1863985 D79662 BE221 049 
AW265018 AI589700 AW196655 N76573 AI370908 BE042393 N75017 AI698870 AW960115 
AL1 33561 AL041090AL1 17481 AL1 22069 AW439292 Al 968826 

AW07291 6 Al 1 8491 3 M4891 95 AW466994 AW469044 N59350 AI81 9642 AI280239 AI220572 AA789302 AI47361 1 AW841 1 26 D60937 
BE041395 AA491826 AA621946 M715980 M666102 
AW970622 AA503009 AA502998 AA502989 AA502805 T92188 

M221036 R87170 BE537068 BE544757 C18935 AW812058 T92565 AA227415 M233942 AA223237 AA668403 AA601627 AW869639 
BE061 833 BE000620 AW961 1 70 AW84751 9 AA308542 AW821 833 AW945688 C04699 AA205504 AA377241 AW821 667 AA055720 
AW817981 AW856468 AA155719 AA179928 T03007 AW754298 AA227407 AM 13928 AA307904 C16859 

AI798376 S46400AW811617AW811616 W00557 BE 142245 AW858232 AW861851 AW858362 M232351 AA218567M055556 AW858231 
AW857541 AW814172 H66214 AW814398 AF1 341 64 AA243093 AA1 73345 M199942 AA223384 AA227092 AA227080 T1 2379 AA0921 74 
T61139M149776AA699829AW879188AW813567AW813538AI267168AA157718AA157719AA100472AA100774AA130756AA157705 
M157730 AA157715AA053524AW849581 AW854566 C05254 AW882836 T92637 AW8 12621 AA206583 AA209204 BE1 56909 AA226824 
AI829309 AW991 957 N66951 AA527374 H6621 5 AA045564 AI694265 H60808 AA149726 AW1 95620 BE081 333 BE073424 AW817662 
AW817705 AW817703 AW817659 BE081531 H59570 
AA628980 AI126603 BE504035 
439000
439285 

439780 
441128 



454241 
455175 
456237 



46771 6_1 
47065J 

47673.1 
51021_2 



443068 558874J 
443947 5861 60_1 
447636 7301.1 



448993 79225.1 



449305 804424J 
451105 859083J 
451320' 86576J 



451807 8865 J 
452410 9163.1 



1067807.1 
1257335.1 
168730.1 
47395.1 




AW373062T55662 AI299190 BE174210 AW579001 H01811 W40186 R67100 AI923886 AW952164 AA628440 AW898607AW898616 
M7091 26 AW898628 AW898544 M947932 AW898625 AW898622 AI276125 A11 85720 AW510698 AA987230 T52522 BE467708 AW243400 
AW043642 AI288245 AI186932 D52654 D55017 D52715 D52477 D53933 D54679AI 298739 Al 1 46984 A1922204 N98343 BE174213 AA845571 
AI813854 AI21451 8 AI635262 AI139455 AI707807 A1698085 AW884528 AI024768 AI004723 AW087420 AI565133 N94964 AI268939 
AW51 3280 AI061 126 AI43581 8 AI8591 06 AI360506 AI024767 AA513019 AA757598 X56196 M902959 A1334784 AI860794 AA01 0207 
AW890091 AW51 3771 A1951 391 A1337671 T52499 AA890205 AI640908 H75966 M463487 M358688 AI961 767 AI866295 AA780994 
AI985913BE174196AA029094AW592159T55581 N79072 AI611201 AA910812 AI220713AW149306 AI758412 AA045713 R79750 N76096 
AW9791 21 AA847986 AA829098 

AL133916 N79113AF086101 N76721 AW950828AA364013 AW955684 AI346341 AI867454 N54784A1 655270 Al 421 279 AW0 14882 
AA775552 N62351 N 59253 AA626243 AI341407 BE 1 75639 M456968 A1358918 AA457077 
AL 109688 R23665 R26578 

M570256 AW014761 AA573721 AI473237 AI022165 M554071 AA127551 N90525 AW973623 AA447991 AA243852 BE328850 AI148171 
A1359627AI005068 AI356567 M232991 AW01 6855 M90 6902 M2331 01 AA1 27550 BE512923 
AI188710AI032142 AW078833 N30308 AW675632 AI21 9028 AI341 201 N22181 H95390 
W24187 W24194 R17789 

Y10043 NM 005342 L05085 AL034450 BE614226 AW749053 AA379173 AA248230 BE514634 AA334622 R70656 AA367593 AA214649 
AA369318 AW957081 R05760 AA039903 AI886597 AW630122 AA906264 AA041527 R01145 AI088688 BE463637 AA398795 AI354883 
AI768938 AI569996 AI452952 AI168582 AI189869 AI086670 AW262560 AW613854 AA862839 AA435840 M670197 AI024032 AI990659 
AI990089 N81095 AA847919 AW9601 50 M21 1075 M044704 AA367594 AW582587 AW858854 AW81 8630 AW818281 AW81 8433 AW582595 
AA096002 N83992 

AI471630 BE540637 BE265481 AW407710 BE513882 BE546739 AA053597 BE140503 BE218514 AW956702 AI656234 AI636283 AI567265 
AW340858 BE207794 AA053085 R69173 AA292343 M454908 M293504 AI659741 AI927478 AA399460 AI760441 AA346416 BE047245 
AA730380 AA394063 AA454833 AI982791 A1567270 AI81 3332 AI767858 AA427705 D20284 AI221 458 BE048537 A1263048 AA34641 7 
M911497 BE537702 
AI638293AW813561 
A1761324AW880941 AW880937 

AW118072AI631982T15734AA224195A1701458W20198 F26326 AA890570 N90552AW071907 AI671352 A1375892 T03517 R88265 
AI124088 AA224388 AI08431 6 AI354686 T33652 AI140719 AI72021 1 T03490 AI372637 T15415 AW205836 AA630384 T03515 T33230 
AA017131 AA443303T33623 AI222556T33511 T33785 AI41 9606 D55612 

W52854AL1 17600 BE208116 BE208432 BE206239 BE082291 AW953423 AA351619 BE180648 BE1 40560 W60080 AA865478 N90291 
AW450652 AW44951 9 AA993634 AI806539 AA351618 AW449522 AI827626 AA904788 AA380381 AA886045 AA774409 BE003229 Z41756 
AL133619 AA468118 AA383064 A1476447 T09430 AI673758 M524895 AI581345 AI300820 AW498812 M256162 AI559724 AI685732 
AA602400 AA905453 AI204595 AW1 66541 AA1 57456 AA1 56269 AA383652 AA431 072 AW592707 AI43541 0 AW272464 AI21 5594 AA622747 
R74039N35031 A1804128 AW513621 AA868351 A1026826 A1493388 AA614641 W81 604 Al 567080 AI214351 M730140 AI125754 AI200813 
AI269603 AI565082 A1807095 AI476629 AA505909 AI368449 AI686077 AI582930 AW085038 AA757863 AA7301 54 AI767072 AA46831 6 
AI734130 AI734138 M426284 AA433997 AI741241 AW043563 AI732741 AI732734 M437369 M425820 AA664048 R74130 
BE144666 BE184942 AW238414 BE184946 
AW993247AW861464 
AA203682R11958 

BE550224AA832519 N45402 AW885857 N29245 BE465409 W07677 AW970089 AI299731 AA482971 BE503548 H18151 W79223 AF086393 
AA461301 W74510 R34182 AI090689 N46003 BE071550 R28075 AW134982 AI240204 AI138906 AW026179 A1572316 BE466182 AI206395 
A1276154AI273269 AI422817 AI371014 AI421274 A1188525 AA939164 BE549810 AW1 37865 AI694996 BE503841 AA459718 BE327407 
BE467534 BE218421 BE467767 M989054 BE467063 AI797130 BE327781 



TABLE 9C 

Pkey: Unique number corresponding to an Eos probeset 

Ref: Sequence source. The 7 digit numbers in this column are Genbank Identifier (GI) numbers. "Dunham I. et al." refers to the publication entitled "The DNA sequence of human chromosome 22." Dunham I. et al., Nature (1999) 402:489-495.
Strand: Indicates DNA strand from which exons were predicted.
Nt_position: Indicates nucleotide positions of predicted exons.
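
For illustration, a nucleotide position field of the kind listed in this table, such as the entry "13558-13721,13942-14090,14554-14679" given for Pkey 400664, can be split into individual predicted exons with the short Python sketch below. The helper names are hypothetical, and treating each range as an inclusive start-end pair on the referenced genomic sequence is an assumption, since the coordinate convention is not stated in the legend.

```python
def parse_exon_positions(field):
    """Split a comma-separated list of 'start-end' ranges into integer pairs."""
    exons = []
    for part in field.split(","):
        start, end = part.strip().split("-")
        exons.append((int(start), int(end)))
    return exons


def total_exon_length(exons):
    """Sum exon lengths, assuming inclusive start and end coordinates."""
    return sum(end - start + 1 for start, end in exons)


# Position field copied from the Pkey 400664 row of this table.
exons = parse_exon_positions("13558-13721,13942-14090,14554-14679")
print(exons)
print(total_exon_length(exons))
```

A parser of this kind fails with a ValueError on fields where a hyphen or digit was lost in extraction, which can help flag damaged entries for manual checking.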



Pkey 


Ref 


Strand 


Nt_position


400512 


9796593 


Minus 


1439-1615 


400517 


9796686 


Minus 


49996-50346 


400560 


9843598 


Plus 


94182-94323,97056-97243,101095-101236,102824-103005 


400664 


8118496 


Plus 


13558-13721,13942-14090,14554-14679 


400665 


8118496 


Plus 


16879-17023 


400666 


8118496 


Plus 


17982-18115,20297-20456 


400749 


7331445 


Minus 


9162-9293 


400763 


8131616 


Minus 


35537-35784 


401027 


7230983 


Minus 


70407-70554,71060-71160 


401093 


8516137 


Minus 


22335-23166 


401203 


9743387 


Minus 


172961-173056,173868-173928


401212 


9858408 


Plus 


87839-88028 


401411 


7799787 


Minus 


144144-144329 


401435 


8217934 


Minus 


54508-55233 


401464 


6682291 


Minus 


170688-170834 


401714 


6715702 


Plus 


96484-96681 


401747 


9789672 


Minus 


118596-118816,119119-119244,119609-119761,120422-120990,130161-130381,130468-130593,131097-131258,131866-131932,132451-132575,133580-134011


401760 


9929699 


Plus 


83126-83250,85320-85540,94719-95287


401780 


7249190 


Minus 


28397-28617,28920-29045,29135-29296,29411-29567,29705-29787,30224-30573 


401781 


7249190 


Minus 


83215-83435,83531-83656,83740-83901,84237-84393,84955-85037,86290-86814 


401785 


7249190 


Minus 


165776-165996,166189-166314,166408-166569,167112-167268,167387-167469,168634-168942 


401797 


6730720 


Plus 


6973-7118 


401961 


4581193 


Minus 


124054-124209 


401985 


2580474 


Plus 


61542-61750 


401994 


4153858 


Minus 


42904-43124,43211-43336,44607-44763,45199-45281,46337-46732


402075 


8117407 


Plus 


121907-122035,122804-122921,124019-124161,124455-124610,125672-126076 


402260 


3399665 


Minus 


113765-113910,115653-115765,116808-116940


402265 


3287673 


Plus 


21059-21168 


402297 


6598824 


Plus 


35279-35405,35573-35659 


402408 


9796239 


Minus 


110326-110491 
4U242U


070C000 

9/yb339 


Plus 


129/ ou-i2yyiy 


402674 


8077108 


Minus 




4023U2 


Odo/AOO 


Minus 


03242-03432 


/A. AAA.* 

4U29y4 


299bb4o 


Minus 


4/ Zi-4yo9 


403137 


y^TT4»4 


Minus 


QOOjIO Q9K79 0.00 QOftQA QOK7Q Q0719 GOQAQ_GAfY79 QA7AP. 0^91 A_Q£*W7 

y234y-92o/ 2,y/yoo-y3Ua4,yoo/y-»on ^yoy^y-y^u/ <£,y*toy i-y^/*K>,yoz i**-yooo/ 


4U33U6 


ouyyy4o 


Plus


t i r i7i(\C\ 1979K1 
12/1 UU-1 2/201 


403329 


on 4 c <i on 


Plus 


ORACH OCCQQ 

yo4ou-yooyo 


4u33ol 


Q JOQOC7 

9400/0/ 


Minus 


ocaao oc47ft 

zouuy-zoi / o 


403478 


9958250 


Plus 


116458-110564 


403485 


9955528 


Plus 


OQQQ OAfH 0<tOQ OC09 OCCC /f"M7 

2oo8-30Ul»3iyo-3032,30OO-4 11 / 


403627 


0569879 


Minus 


23ob 0-24342 


403715


7239669 


Plus 


OF4 00 OCOfiO 

85128-65292 


404044 


9558573 


Minus 


OOC7T7 OOCtVSQ 

225/0 / -225939 


404076 


9931752


Minus 


3o4o-390/ 


404101 


QA.7CAOC 

oUf092o 


Minus 


125/42-1 2099/ 


404140 


9843520 


Plus


077C"I OQ4./I7 

37751-3814/ 


4U41bO 


yy2b4o9 


Minus 


o9U2o-oyi2o 


404185 


4572584 


Minus 




404210 


0UU024b 


Plus


ioyy2o-i /ui/1 


404253 


ooc70no 

yob/2U2 


Minus


OOD/O-OoUOO 


404287 


2320014 


Plus


031 34-03-40 1 


404298 


9944200 


Minus 


7QCO-J 70700 

/3091-/3/23 


404347 


9838195 


Plus 


74493-74829 


404440 


7528051 


Plus 


80430-81581


404721 


9856648 


Minus 


17O7C0 47A1QA 

1/3/03- 1 /4294 


404794 


4o2o439 


Plus 


lU1bly-lUlo9o 


404854 


7143420 


Plus 


■I yloen -1 AC07 
142bU-14o3/ 


404877 


1519284 


Plus 


lUy 0-210/ 


404927 


7342002 


Plus 


oob90-b9ob3 


404996 


6007890 


Plus


0700D OQ4AE ODCCO OQQQQ 0O707 00079 AAKR7 y)flft7.A A90K1 AOARfi 

3/999-3ol4o,3obo2-3oyyo,3y/2/-3yo7 2 r 4U00/-4U0/4 f 4^00 


405449 


7622497 


Plus


AOOOC JlOC7f\ 

42230-420 /U 


405568 


6006906 


Plus


3591 2-obObo 


405572 


3800891 


Plus 


OO230-oo93o 


405646 


4914350 


Plus


741-969 


4UO0/O 


*k>o/uo/ 


Plus


701 Qc 70Q17 
/ O 1 OO- / OS < / 


405770 


2735037 


Plus 


61057-62075 


405932 


7767812 


Minus 


123525-123713 


406137 


9166422 


Minus 


30487-31056 


406360 


9256107 


Minus 


7513-7673 


406399 


9256288 


Minus 


63448-63554 


406467 


9795551 


Plus 


182212-182958 



TABLE 10A: Potential Therapeutic, Diagnostic and Prognostic targets for Therapy of Lung Cancer and Non-malignant Lung Disease 

Table 10A shows about 307 genes up-regulated in non-malignant lung disease relative to lung tumors and normal body tissues and/or down-regulated in lung tumors relative to
normal lung and non-malignant lung disease. These genes were selected from about 59680 probesets on the Eos/Affymetrix Hu03 Genechip array.

Table 10B shows the accession numbers for those Pkeys lacking UnigeneIDs in Table 10A. For each probeset we have listed the gene cluster number from which the
oligonucleotides were designed. Gene clusters were compiled using sequences derived from Genbank ESTs and mRNAs. These sequences were clustered based on sequence
similarity using Clustering and Alignment Tools (DoubleTwist, Oakland, California). The Genbank accession numbers for sequences comprising each cluster are listed in the
"Accession" column.

Table 10C shows the genomic positioning for those Pkeys lacking UnigeneIDs and accession numbers in Table 10A. For each predicted exon, we have listed the genomic
sequence source used for prediction. Nucleotide locations of each predicted exon are also listed.



Pkey: Unique Eos probeset identifier number

ExAccn: Exemplar Accession number, Genbank accession number 

UnigeneID: Unigene number

Unigene Title: Unigene gene title 

R1: Average of lung tumors (including squamous cell carcinomas, adenocarcinomas, small cell carcinomas, granulomatous and carcinoid tumors) divided by the
average of normal lung samples

R2: Average of non-malignant lung disease samples (including bronchitis, emphysema, fibrosis, atelectasis, asthma) divided by the average of normal lung samples 



Pkey 


ExAccn 


UnigeneID


Unigene Title 


R1 


R2 


404394 




ENSP00000241075:TRRAP PROTEIN. 


0.79 


3.10


404916 






Target Exon 


1.00


159.00 


405257 






Target Exon 


1.00 


422.00 


407228 


M25079 


Hs.155376 


hemoglobin, beta 


0.47 


2.33 


407568 


AA740964 


Hs.62699 


ESTs 


1.00 


123.00 


408562 


AI436323 


Hs.31141 


Homo sapiens mRNA for KIAA1568 protein, 


1.00 


230.00 


409031 


AA376836 


Hs.76728 


ESTs 


1.00 


128.00 


410434 


AF051152 


Hs.63668 


toll-like receptor 2 


39.65 


149.00 


410467 


AF102546 


Hs.63931 


dachshund (Drosophila) homolog 


1.00 


109.00 


410808 


T40326 


Hs.167793 


ESTs 


1.14 


13.14 


412351 


AL135960 


Hs.73828 


T-cell acute lymphocytic leukemia 1 


0.37 


2.27 


412372 


R65998 


Hs.285243


hypothetical protein FLJ22029


1.00 


173.00 


413795 


AL040178 


Hs.142003 


ESTs 


0.10 


11.90 


414154 


AW205314 


Hs.323060 


ESTs 


0.62 


2.09 


414214 


D49958 


Hs.75819 


glycoprotein M6A 


0.03 


4.55 


414998 


NM_002543


Hs.77729


oxidised low density lipoprotein (lectin


0.64 


2.97 


415122 


D60708 


Hs.22245 


ESTs 


0.07 


8.97 


415765 


NM_005424 


Hs.78824 


tyrosine kinase with immunoglobulin and 


0.67 


1.65 


415775 


H00747 


Hs.29792 


ESTs, Weakly similar to I38022 hypotheti 


0.29 


2.64 


415910 


U20350 


Hs.78913 


chemokine (C-X3-C) receptor 1 


1.00 


145.00 



416319 


AI815601


Hs.79197 


416402 


NM_000715


Hs.1012 


417355 


D13168 


Hs.82002 


417421 


AL138201


Hs.82120 


417511 


AL049176 


Hs.82223 


418489 


U76421 


Hs.85302 


418726 


BE241812 


Hs.87860 


418741 


H83265 


Hs.8881 


418883 


BE387036 


Hs.1211 


419086 


NM_000216


Hs.89591 


419150 


T29618 


Hs.89640 


419235 


AW470411 


Hs.288433 


419407 


AW410377 


Hs.41502 


420556 


AA278300 


Hs.124292


420656 


AA279098 


Hs.187636


420729 


AW964897 


Hs.290825 


421177 


AW070211 


Hs.102415 


422060 


R20893 


Hs.325823 


422426 


W79117 


Hs.58559 


422652 


AW967969 


Hs.118958


423099 


NM_002837


Hs.123641


424433 


H04607 


Hs.9218 


424585 


M464840 


Hs.131987


424711 


NM_005795


Hs.152175 


424973 


X92521 


Hs.154057


425023 


AW956889 


Hs.154210


425664 


AJ006276 


Hs.159003


425998 


AU076629 


Hs.165950


426657 


NM_015865


Hs.171731


426753 


T89832 


Hs.170278 


427558 


D49493 


Hs.2171 


427983 


M17706 


Hs.2233 


428467 


AK002121 


Hs.184465


428927 


AA441837 


Hs.90250 


429496 


AA453800 


Hs.192793


430468 


NM_004673


Hs.241519 


431385 


BE178536


Hs.11090


431728 


NM_007351


Hs.268107 


431848 


AI378857 


Hs.126758


432128 


AA127221


Hs.117037 


432519 


AI221311 


Hs.130704


433043 


W57554 


Hs.125019


433803 


AI823593


Hs.27688 


434730 


AA644669 


Hs.193042


435472 


AW972330 


Hs.283022 


436532 


AA721522 




437119 


AI379921 


Hs.177043 


437140 


AA312799 


Hs.283689 


437211 


AA382207 


Hs.5509 


437960 


AI669586


Hs.222194 


438202 


AW169287


Hs.22588 


438873 


AI302471 


Hs.124292


438875 


AA827640 


Hs.189059 


441048 


AA913488 


Hs.192102 


441188 


AW292830 


Hs.255609 


441499 


AW298235 


Hs.101689


444513 


AL120214


HsJ117 


444527 


NM_005408


Hs.11383


444561 


NM_004469


Hs.11392 


445279 


R41900 


Hs.22245 


446017 


N98238 


Hs.55185 


446984 


AB020722 


Hs.16714 


446998 


N99013 


Hs.16762 


447357 


AI375922 


Hs.159367


448106 


AI800470


Hs.171941 


448253 


H25899 


Hs.201591 


449275 


AW450848 


Hs.205457 


450400 


AI694722 


Hs.279744 


450696 


AI654223 


Hs.16026


450726 


AW204600 


Hs.250505 


451497 


H83294 


Hs.284122 


451533 


NM_004657


Hs.26530 


453636 


R67837 


Hs.169872


458332 


AI000341 


Hs.220491 


459580 


AA022888 


Hs.176065 


400269 






403421 






407570 


Z19002 


Hs.37096 


412295 


AW088826 


Hs.117176 


414517 


M24461 


Hs.76305 


417204 


N81037 


Hs.1074 


418307 


U70867 


Hs.83974 


418935 


T28499 


Hs.89485 


421502 


AF111856 


Hs.105039 


421798 


N74880 


Hs.29877 



CD83 antigen (activated B lymphocytes, i 15.32 237.00 

complement component 4-binding protein, 0.64 4.00 

endothelin receptor type B 0.01 3.90

nuclear receptor subfamily 4, group A, m 36.30 357.00 

chordin-like 1.00 179.00 

adenosine deaminase, RNA-specific, B1 (h 0.02 6.00 

protein tyrosine phosphatase, non-recept 1.00 113.00

ESTs, Weakly similar to S41044 chromosom 0.44 1.90 

acid phosphatase 5, tartrate resistant 0.96 2.04 

Kallmann syndrome 1 sequence 0.62 2.74 

TEK tyrosine kinase, endothelial (venous 0.03 6.90 
hypothetical protein FLJ10392
KIAA0758 protein 

purine-rich element binding protein A 

ESTs 

ESTs 

ESTs, Weakly similar to KIAA1324 protein 
DKFZP564D206 protein 
novel SH2-containing protein 3 
cartilage acidic protein 1 
ESTs 

purine-rich element binding protein A 
epithelial membrane protein 2 
ESTs 

vanilloid receptor-like protein 1 

Homo sapiens cDNA FLJ11422 fis, clone HE 

ESTs 

ESTs, Weakly similar to JC5795 CDEP prot 
gb:CM2-HT0342-091299-050-b05 HT0342 Homo 
up-regulated by BCG-CWS 
Homo sapiens, clone MGC:16327, mRNA, com
cathepsin Z

ESTs, Weakly similar to ALU4_HUMAN ALU S
gb:HSC1KA072 normalized infant brain cDN 



TABLE 10B

Pkey: Unique Eos probeset identifier number 
CAT number: Gene cluster number 
Accession: Genbank accession numbers 



1.00 


QA AA 
04.UU 


0.02 


4.38 


1.00 


97.00 


0.93 


1.69 


1.00 


106.00 


0.40 


47.20 


1.00 


100.00 


0.05 


Q fH 
O.^l 


0.02 


C AO 


i aa 
1.00 


/y.UU 


0.42 


1.00 


a An 
U.l/ 


11. do 


1.00 


dA AA 

cf4.UU 


1.00 


Q1 AA 

yi.uu 


i aa 
l.UU 


1 CI AA 
1 OZ.UU 


4 AO 
l.UU 


OR AA 
Ou.UU 


0.60 


1 QA 

l.oU 


A CA 

U.54 


1.31 


1.00 


C7 AA 

b/.UU 


A CI 

4.53 


H A7 

n.u/ 


0.72 


1 OA 

2.24 


1,00 


CtQ AA 
OO.UU 


0.83 


1 7A 


i AA 

l.UU 


1*39 Art 
lOZ.UU 


4 AA 
l.UU 


70 AA 
f Z.UU 


1 no. 


68.00 


0.57 


2.89 


1.00 


82.00 


0.79 


1.96 


1.03 


3.25 


1.00 


113.00 


1.00 


544.00 



Pkey CAT Number Accession

408074 103684_1 R20723 AA263003 AA333976 M334725 AA334151 AW965490 M310513 AI810530 D31302 AW134897 AA830127 AA046953 AI668930
C06094 AW104534

411667 1253334_1 BE160198 AW935898 T11520 AW935930 AW856073 AW861034

413533 1375344_1 BE146973 BE146972 BE147042 BE147018 BE146783 BE147020 BE146781 BE147019 BE146766 BE147021 BE146952 BE146767 BE147044
BE146797 BE146776 BE146985 BE146793 BE146768 BE146771 BE146954 BE146760 BE147048 BE147025 BE147030

423387 22779_1 AJ012074 U11087 L13288 X75299 L20295 AW630780 H14880 T28037 AI872991 R72136 AW449839 T81622 T79697 T29519 R94105 T83923
R73300 AI797007 R73390 AA961010 H74168 AI689932 BE045543 AI808418 AI608912 AI806573 AW884084 AW872978 AW872985 AA565655
AI022915 R50647 R73210 H45098 R46451 AW166269 T71132 AI264547 R52146 AI304920 R73391 AW884059 AW884085 H73241 T60038
T79612 R73145 R50549 AI094557 AI668793 R72302 AI564366 W01956 M418962 W32571 R72840 H45409 R72085 R46356 R46758
AA508805 AA418798 T83751 R94072 T16182 AA928785 AA903896

423696 23112_1 Z92546 AA330586 AI570568 AW341487 AI827050 AW298668 AI792189 AI015693 AI733599 AI572251 AI672488 AW193262 AI244716
AI864375 AI206100 AA912444 AI269365 AI640254 AW772466 AI867336 AA627604 H16914 AA358477 M338009

430212 314437_1 M469153 AI718503 M469225

436532 421802_1 AA721522 AW975443 T93070

453531 97026_1 AA417940 AA036735 T07025

454741 1232559_1 BE154396 AW817959 BE154393



TABLE 10C 



Pkey: Unique number corresponding to an Eos probeset

Ref: Sequence source. The 7 digit numbers in this column are Genbank Identifier (GI) numbers. "Dunham I. et al." refers to the publication entitled "The DNA
sequence of human chromosome 22." Dunham I. et al., Nature (1999) 402:489-495.
Strand: Indicates DNA strand from which exons were predicted.
Nt_position: Indicates nucleotide positions of predicted exons.



Pkey Ref Strand Nt_position

400754 7331445 Plus 144559-144684
401045 8117619 Plus 90044-90184,91111-91345
401083 3242744 Plus 33192-33360
402474 7547175 Minus 53526-53628,55755-55920,57530-57757
402808 6456148 Minus 114964-115136,115461-115585,115931-116047,117666-117771,118004-118102
403021 7547270 Plus 120799-120966
403421 9665041 Minus 126609-126773,139986-140205
403438 9719679 Plus 90792-90938
403687 7387384 Plus 9009-9534
403764 7717105 Minus 118692-118853
404277 1834458 Minus 91665-91946
404288 2769644 Plus 3512-3691
404394 3135305 Minus 37121-37205,37491-37762,41053-41140,41322-41593,41773-41919
404518 8151988 Plus 84494-84603
404916 7341826 Plus 91057-91188
405106 8079395 Minus 80877-81418
405257 7329310 Plus 73121-73273
405381 6006920 Minus 7636-8054
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406387 9256180 Plus 116229-116371,117512-117651 
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TABLE 11 A: Genes Distinguishing Adenocarcinoma from Other Lung Diseases and Normal Lung 

Table 11A shows about 84 genes upregulated in lung adenocarcinomas relative to other lung tumors, non-malignant lung disease, and normal lung. These genes were selected
from about 59680 probesets on the Eos/Affymetrix Hu03 Genechip array.

Table 11B shows the accession numbers for those Pkey's lacking UnigeneID's for Table 11A. For each probeset we have listed the gene cluster number from which the
oligonucleotides were designed. Gene clusters were compiled using sequences derived from Genbank ESTs and mRNAs. These sequences were clustered based on sequence
similarity using Clustering and Alignment Tools (DoubleTwist, Oakland, California). The Genbank accession numbers for sequences comprising each cluster are listed in the
"Accession" column.

Table 11C shows the genomic positioning for those Pkey's lacking Unigene ID's and accession numbers in Table 11A. For each predicted exon, we have listed the genomic
sequence source used for prediction. Nucleotide locations of each predicted exon are also listed.

Pkey: Unique Eos probeset identifier number 

ExAccn: Exemplar Accession number, Genbank accession number 

UnigeneID: Unigene number

Unigene Title: Unigene gene title 

R1: Average of lung tumors (including squamous cell carcinomas, adenocarcinomas, small cell carcinomas, granulomatous and carcinoid tumors) divided by the 
average of normal lung samples 

R2: Average of non-malignant lung disease samples (including bronchitis, emphysema, fibrosis, atelectasis, asthma) divided by the average of normal lung samples
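
By way of illustration only, the ratios R1 and R2 defined above may be computed as sketched below. The function name, the use of an arithmetic mean, and the sample intensities are assumptions made for this example; they are not taken from the data underlying the tables, and any normalization or flooring of the raw values is not addressed.

```python
from statistics import mean


def expression_ratios(tumor, disease, normal):
    """Return (R1, R2) for one probeset.

    R1 = average of lung tumor samples / average of normal lung samples
    R2 = average of non-malignant lung disease samples / average of normal lung samples
    An arithmetic mean is assumed here.
    """
    normal_avg = mean(normal)
    if normal_avg == 0:
        raise ValueError("average of normal lung samples is zero")
    r1 = mean(tumor) / normal_avg
    r2 = mean(disease) / normal_avg
    return r1, r2


# Hypothetical intensities for a single probeset (illustrative values only).
r1, r2 = expression_ratios(
    tumor=[120.0, 80.0],
    disease=[30.0, 10.0],
    normal=[10.0, 10.0],
)
print(f"R1={r1:.2f} R2={r2:.2f}")
```

In this sketch a value of 1.00 in either column corresponds to no change relative to normal lung.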



Pkey    ExAccn     UnigeneID  Unigene Title                             R1      R2

403329                        Target Exon                               1.00    61.00
406399                        NM_003122*:Homo sapiens serine protease   1.00    39.00
406690  M29540     Hs.220529  carcinoembryonic antigen-related cell ad  226.37  350.00
407869  AI827976   Hs.24391   hypothetical protein FLJ13612             0.77    1.18
407881  AW072003   Hs.40968   heparan sulfate (glucosamine) 3-O-sulfot  1.00    10.00
408908  BE296227   Hs.250822  serine/threonine kinase 15                7.76    1.00
409103  AF251237   Hs.112208  XAGE-1 protein                            80.44   40.00
409187  AF154830   Hs.50966   carbamoyl-phosphate synthetase 1, mitoch  1.00    1.00
409269  AA576953   Hs.22972   hypothetical protein FLJ13352             1.00    1.00
410076  T05387     Hs.7991    ESTs                                      1.12    1.50
410102  AW248508   Hs.279727  Homo sapiens cDNA FLJ14035 fis, clone HE  9.89    1.00
410399  BE068889              synuclein, gamma (breast cancer-specific  0.92    1.06
411908  L27943     Hs.72924   cytidine deaminase                        1.00    1.00
412612  NM_000047  Hs.74131   arylsulfatase E (chondrodysplasia puncta  1.02    1.03
414075  U11862     Hs.75741   amiloride binding protein 1 (amine oxida  0.84    1.07
416208  AW291168   Hs.41295   ESTs, Weakly similar to MUC2_HUMAN MUCIN  3.67    1.00
417542  J04129     Hs.82269   progestagen-associated endometrial prote  1.28    1.35
419183  U60669     Hs.89663   cytochrome P450, subfamily XXIV (vitamin  1.00    1.00
419502  AU076704              fibrinogen, A alpha polypeptide           13.05   115.00
419631  AW188117   Hs.303154  popeye protein 3                          1.00    13.00
420931  AF044197   Hs.100431  small inducible cytokine B subfamily (Cy  1.00    8.00
421155  H87879     Hs.102267  lysyl oxidase                             1.00    15.00
421190  U95031     Hs.102482  mucin 5, subtype B, tracheobronchial      1.17    1.55
421474  U76362     Hs.104637  solute carrier family 1 (glutamate trans  1.46    1.76
421515  Y11339     Hs.105352  GalNAc alpha-2, 6-sialyltransferase I, l  1.00    3.00
421582  AI910275              trefoil factor 1 (breast cancer, estroge  1.23    1.00
422026  U80736     Hs.110826  trinucleotide repeat containing 9         1.00    52.00
422095  AJ868872   Hs.282804  hypothetical protein FLJ22704             4.37    2.34
422311  AF073515   Hs.114948  cytokine receptor-like factor 1           1.15    1.78
422867  L32137     Hs.1584    cartilage oligomeric matrix protein (pse  1.69    3.17
423472  AF041260   Hs.129057  breast carcinoma amplified sequence 1     48.13   72.00
423554  M90516     Hs.1674    glutamine-fructose-6-phosphate transamin  1.00    50.00
424502  AF242388   Hs.149585  lengsin                                   1.00    1.00
424544  MB8700     Hs.150403  dopa decarboxylase (aromatic L-amino aci  1.00    59.00
424905  NM_002497  Hs.153704  NIMA (never in mitosis gene a)-related k  21.35   1.00
424960  BE245380   Hs.153952  5' nucleotidase (CD73)                    1.00    1.00
425523  AB007948   Hs.158244  KIAA0479 protein                          1.00    35.00
426230  AA367019   Hs.241395  protease, serine, 1 (trypsin 1)           1.00    83.00
427701  AA411101   Hs.243886  nuclear autoantigenic sperm protein (his  7.41    34.00
428585  AB007863   Hs.185140  KIAA0403 protein                          1.00    6.00
428758  AA433988   Hs.98502   hypothetical protein FLJ14303             1.06    1.13
429170  NM_001394  Hs.2359    dual specificity phosphatase 4            16.18   105.00
429263  AA019004   Hs.198396  ATP-binding cassette, sub-family A (ABC1  1.07    1.00
429610  AB024937   Hs.211092  LUNX protein; PLUNC (palate lung and nas  1.59    1.69
430508  AI015435   Hs.104637  ESTs                                      4.75    7.27
430985  AA490232   Hs.27323   ESTs, Weakly similar to I78885 serine/th  0.94    1.28
431548  AI834273   Hs.9711    novel protein                             5.66    15.00
431566  AF176012   Hs.260720  J domain containing protein 1             49.76   37.00
431986  AA536130   Hs.149018  Novel human gene mapping to chomosome 20  1.19    1.47
432375  BE536069   Hs.2962    S100 calcium-binding protein P            1.65    1.06
432677  NM_004482  Hs.278611  UDP-N-acetyl-alpha-D-galactosamine:polyp  1.00    48.00
433556  W56321     Hs.111460  calcium/calmodulin-dependent protein kin  1.00    19.00
433819  AW511097   Hs.112765  ESTs                                      3.71    8.00
434001  AW950905   Hs.3697    serine (or cysteine) proteinase inhibito  29.31   72.00
434424  AI811202   Hs.325335  Homo sapiens cDNA: FLJ23523 fis, clone L  1.00    64.00
434792  AA649253   Hs.132458  ESTs                                      8.52    44.00
436217  T53925     Hs.107     fibrinogen-like 1                         57.97   31.00
436749  AA584890   Hs.5302    lectin, galactoside-binding, soluble, 4   1.10    1.41
436972  AA284679   Hs.25640   claudin 3                                 1.59    1.46
437866  AA156781              metallothionein 1E (functional)           3.62    101.00
437935  AW939591   Hs.5940    mucin 13, epithelial transmembrane        1.60    1.39
438915  AA280174   Hs.285681  Williams-Beuren syndrome chromosome regi  1.00    1.00
439451  AF086270   Hs.278554  heterochromatin-like protein 1            23.28   52.00
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Ui. C77nQ 

HS.b/ /uy 


Homo sapiens mRNA full length insert cDN


1 nn 

l.UU 


91 nn 


441031 


All 1U004 


Hs.7645 


fibrinogen, B beta polypeptide 


1.4 1 


qq nn 


441377 


BE218239 


Hs.202b5o 


ESTs 


££.UO 


i nn 

l.UU 


443614 


AV655386 


Hs.7645 


fibrinogen, B beta polypeptide 


1 nn 
l.UU 


ir nn 
ib.UU 


443813 


AAorboV L 


ns.yjybi 


Homo sapiens itikna, cuna Ursr^pbb/ uuyo \u 


l.ZU 


i QQ 


443991 


NM_002250


Hs.10082


potassium intermediate/small conductance 


C "71 


C Q7 

b.o/ 


444670 


H58373 


Hs.332938 


hypothetical protein MbObovu 


i.yb 


nn 
oo.uu 


444931 


Av652uoo 


Hs.75113 


general transcription factor II iA 


i nn 
l.UU 


ka nn 
04.UU 


446102 


AW168067


Hs.317694


ESTs 


1.00 


1 nn 
l.UU 


446163 


AAUZbooU 


nS.^o<cbz 


Homo sapiens cDNA FLJ13603 fis, clone PL


i nn 
l.UU 


•?r nn 
oo.uu 




btuy4o4o 


ns. i D 1 1 o 


homogentisate 1,2-dioxygenase (homogenti 


1 nn 


11 nn 

I I .uu 


447388 


AWbJUW4 


Hs.76277 


Homo sapiens, clone MGC:9381, mRNA, comp


1 OA 


J. ID 


44 {XiOC 


AfMJUUO 14 


us. io/y i 


nypoineucai protein rLj^uouf 


1.23 


1.63 


AAa'iA'i 
440Z40 


Awooy/ 1 I 


nS.DZDZU 


iniegnn, oeia o 


15.84 


1.00 


448844 




He 177164 

1 JO, J / / J ut 


ESTs 


1.00 


31.00 


449444  AW818436   Hs.23590   solute carrier family 16 (monocarboxylic  1.00    83.00
451807  W52854                hypothetical protein FLJ23293 similar to  1.55    35.00
452689  F33868     Hs.284176  transferrin                               1.54    1.44
453392  U23752     Hs.32964   SRY (sex determining region Y)-box 11     1.00    16.00
453464  AI884911   Hs.32989   receptor (calcitonin) activity modifying  1.55    2.45
453735  AI066629   Hs.125073  ESTs                                      1.01    1.30



TABLE 11B 

Pkey: Unique Eos probeset identifier number 
CAT number: Gene cluster number 
Accession: Genbank accession numbers 

Pkey CAT Number Accession 

410399 11995_1 BE068889 BE068882 AF044311 AF017256 NM_003087 AF037207 AF010126 AA633976 AA872836 BE298825 BE299889 AI016464 AI684600
AI936527 AA804675 AA394097 AI139933 AA946606 BE171313 AA722407 AA293803 AI468480 M056035 AA055968 AW796957 AI637713
AA410737 H49348 AA486472 AA411094 AA235594 AA402624 AA443638 AW452137 AA421708 AW265211 AI493266 AA365132 AW966044

419502 18535_1 AU076704 T74854 T74860 T72098 T73265 T73873 T69180 T74658 T58786 T60385 T73410 T68781 T67845 T67593 T73952 T67864 T60630
T68367 T68401 T53959 T72360 T72099 T60377 T58961 T71712 T72821 T64738 T74645 T72037 T68688 T72063 T73258 T72826 T64242
T68220 T74673 T71800 T68355 T61227 T62738 T69317 T53850 T64692 T73768 T73962 T73382 T68914 T70975 T73400 T60631 T73277
T73203 T70498 T61409 T58925 NM_000508 M64982 T68301 T73729 T69445 T60424 T67922 T67736 T68716 T67755 T74765 T73819 T58719
T74756 T60477 T74863 T61109 T68329 T58850 T71857 T73425 T53736 T68607 T58898 T64309 T72031 T72079 T64305 T71908 T68107
T71916 T73787 T56035 T64425 T71870 T60476 T61376 T67820 T71895 T41006 T69441 T68170 T74617 T71958 T69440 T61875 R06796
H48353 T71914 T53939 T64121 AA693996 T72525 T67779 T68078 AA011465 M345378 AV654847 AV654272 AV656001 AI064740 T82897
N33594 M344542 AW805054 AI207457 T61743 AA026737 H94389 AA382695 AA918409 T68044 S82092 T39959 AI017721 M312395
AA312919 T40156 H66239 AV652989 H38728 R98521 AV655200 R95790 W03250 W00913 AA344136 AV660126 R97923 AA343596
AW470774 AV651256 N54417 AA812862 AW182929 AI111192 H61463 H72060 AA344503 H38639 AI277511 AV661108 AI207625 T47810
AA235252 T27853 T47778 R95746 H70620 M701463 AW827166 R98475 C20925 AV657287 T71959 T71313 T73920 T73333 T61618 T69293
T69283 T73931 T72178 T72456 AV645639 AV653476 T72957 T72300 T58906 T71457 T70494 T72956 T70495 T68267 T74407 T85778
AA344726 T27854 T74485 T74101 T73868 T71518 T72304 AA343853 T73909 T68070 T72065 H72149 T73493 T73495 AV645993 R02293
T70475 T64751 AA344441 AA343657 AA345732 AA344328 AI110639 AA344603 AF063513 T64696 T68516 T72223 T60507 T67633 R29500
T72517 R02292 T60599 T69206 T70452 T74677 R29366 T61277 T74914 T60352 R29675 T74843 AV645792 AA344408 T69197 T72057
T69368 T69358 T68258 AV650429 T73341 T61702 T74598 T40095 K02272 T40106 AA343045 M341908 AA341907 AA342807 AA341964
T53747 T72042 T62764 AI064899 AA343060 T67832 T72440 T71770 T68091 T69108 T72449 T69167 T71289 T68251 AV654844 T64375
AA345234 T67598 M011414 T68036 H48262 AI207557 T68219 W86031 T69081 T64232 R93196 T62136 AV650539 H67459 T72978
AA344583 T60362 H58121 T95711 T72803 T68055 T71715 R29036 T72793 T69122 T64595 T62888 T69139 T68291 T64652 T67971 T46862
AA693592 AI248502 R29454 T64764 T57001 T73052 T71429 T51176 T58866 AV655414 H90426 AA342489 T73666 T67848 T72512 T53835
T67837 T73317 T74273 T69420 T68245 T74380 T67862 T74474 T56068

421582 2041_1 AI910275 X00474 X52003 X05030 NM_003225 AA314326 AA308400 AA506787 AA314825 AI571948 AA507595 AA614579 AA587613 R83818
AA568312 AA614409 AA307578 AI925552 AW950155 AI910083 M12075 BE074052 AW004668 AA578674 AA582084 BE074053 BE074126
BE074140 AA514776 AA588034 BE074051 BE074068 AW009769 AW050690 AA858276 R55389 AI001051 AW050700 AW750216 AA614539
BE074045 AI307407 AW602303 BE073575 AI202532 AA524242 AI970839 AI909751 BE076078 AI909749 R55292

437866 44433_2 AA156781 AW293839 U52054 AA024963 AA778446 BE073977 AW444904 AW602574 BE164040 BE164012 BE163972 BE163974 BE163992
AA837481 AW468444 BE185091 AW468002 AA687333 AA811830 AA581806 AI866686 AI572124 AA043777 AA040926 D20160 AI536733
AA812489 AW874142 AI471883 W84421 AA156850

451807 8865_1 W52854 AL117600 BE208116 BE208432 BE206239 BE082291 AW953423 AA351619 BE180648 BE140560 W60080 AA865478 N90291
AW450652 AW449519 M993634 AI806539 AA351618 AW449522 AI827626 AA904788 AA380381 M886045 AA774409 BE003229 Z41756



TABLE 11C 

Pkey: Unique number corresponding to an Eos probeset

Ref: Sequence source. The 7 digit numbers in this column are Genbank Identifier (GI) numbers. "Dunham I. et al." refers to the publication entitled "The DNA
sequence of human chromosome 22." Dunham I. et al., Nature (1999) 402:489-495.

Strand: Indicates DNA strand from which exons were predicted.

Nt_position: Indicates nucleotide positions of predicted exons.

Pkey Ref Strand Nt_position

403329 8516120 Plus 96450-96598 

406399 9256288 Minus 63448-63554 
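
As a non-limiting sketch, the nucleotide-position column of the genomic positioning tables may be read as a comma-separated list of start-end ranges, one range per predicted exon. The function names below are hypothetical, and the treatment of the coordinates as inclusive is an assumption of this example rather than something stated in the legend.

```python
def parse_nt_position(field):
    """Split a position field such as '17982-18115,20297-20456' into (start, end) pairs."""
    exons = []
    for part in field.split(","):
        start_text, end_text = part.strip().split("-")
        start, end = int(start_text), int(end_text)
        if start > end:
            raise ValueError(f"start exceeds end in range {part!r}")
        exons.append((start, end))
    return exons


def exon_lengths(exons):
    """Length of each predicted exon, assuming inclusive coordinates."""
    return [end - start + 1 for start, end in exons]


# Position fields for Pkeys 403329 and 400666 as listed in the tables.
single = parse_nt_position("96450-96598")
double = parse_nt_position("17982-18115,20297-20456")
print(single, exon_lengths(single))
print(double, exon_lengths(double))
```

The strand column is not used in this sketch; a fuller treatment would carry it alongside each range.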
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TABLE 12A: Genes Distinguishing Squamous Cell Carcinoma from Other Lung Diseases and Norma) Lung 

Table 12A shows about 72 genes upregulated in squamous cell carcinomas of the lung relative to other lung tumors, non-malignant lung disease, and normal lung. These genes 
were selected from about 59680 probesets on the Eos/Affymetrix Hu03 Genechip array. 

Table 12B shows the accession numbers for those Pkey's lacking UnigeneID's for Table 12A. For each probeset we have listed the gene cluster number from which the
oligonucleotides were designed. Gene clusters were compiled using sequences derived from Genbank ESTs and mRNAs. These sequences were clustered based on sequence
similarity using Clustering and Alignment Tools (DoubleTwist, Oakland, California). The Genbank accession numbers for sequences comprising each cluster are listed in the
"Accession" column.

Table 12C shows the genomic positioning for those Pkey's lacking Unigene ID's and accession numbers in Table 12A. For each predicted exon, we have listed the genomic
sequence source used for prediction. Nucleotide locations of each predicted exon are also listed.

Pkey: Unique Eos probeset identifier number 

ExAccn: Exemplar Accession number, Genbank accession number 

UnigeneID: Unigene number

Unigene Title: Unigene gene title 

R1: Average of lung tumors (including squamous cell carcinomas, adenocarcinomas, small cell carcinomas, granulomatous and carcinoid tumors) divided by the 
average of normal lung samples 

R2: Average of non-malignant lung disease samples (including bronchitis, emphysema, fibrosis, atelectasis, asthma) divided by the average of normal lung samples



Pkey    ExAccn     UnigeneID  Unigene Title                             R1      R2

400289  X07820     Hs.2258    matrix metalloproteinase 10 (stromelysin  132.45  4.00
400666                        NM_002425:Homo sapiens matrix metallopro  3.26    3.22
401780                        NM_005557*:Homo sapiens keratin 16 (foca  26.47   10.50
401781                        Target Exon                               10.33   4.61
401785                        NM_002275*:Homo sapiens keratin 15 (KRT1  4.13    2.70
401994                        Target Exon                               61.84   47.00
402075                        ENSP00000251056*:Plasma membrane calcium  1.00    1.00
404996                        Target Exon                               1.00    1.00
407839  AA045144   Hs.161566  ESTs                                      173.91  108.00
408000  L11690     Hs.620     bullous pemphigoid antigen 1 (230/240kD)  151.17  8.00
408522  AI541214   Hs.46320   Small proline-rich protein SPRK [human,   1.98    1.24
410561  BE540255   Hs.6994    Homo sapiens cDNA: FLJ22044 fis, clone H  10.04   1.00
415091  AL044872   Hs.77910   3-hydroxy-3-methylglutaryl-Coenzyme A sy  1.00    30.00
415817  U88967     Hs.78867   protein tyrosine phosphatase, receptor-t  24.30   1.00
416658  U03272     Hs.79432   fibrillin 2 (congenital contractural ara  53.29   51.00
417034  NM_006183  Hs.80962   neurotensin                               1.00    1.00
417366  BE185289   Hs.1076    small proline-rich protein 1B (cornifin)  8.97    3.27
418663  AK001100   Hs.41690   desmocollin 3                             112.17  19.00
418678  NM_001327  Hs.87225   cancer/testis antigen                     1.18    1.10
419121  M374372    Hs.89626   parathyroid hormone-like hormone          1.00    1.00
420783  AI659838   Hs.99923   lectin, galactoside-binding, soluble, 7   3.04    1.25
421773  W69233     Hs.112457  ESTs                                      1.12    1.14
421948  L42583     Hs.334309  keratin 6A                                51.83   20.25
421978  AJ243662   Hs.110196  NICE-1 protein                            1.01    0.91
422158  L10343     Hs.112341  protease inhibitor 3, skin-derived (SKAL  2.37    1.10
422440  NM_004812  Hs.116724  aldo-keto reductase family 1, member B10  47.53   32.00
423634  AW959908   Hs.1690    heparin-binding growth factor binding pr  76.02   1.00
423725  AJ403108   Hs.132127  hypothetical protein LOC57822             4.20    1.00
423738  AB002134   Hs.132195  airway trypsin-like protease              10.14   51.00
424012  AW368377   Hs.137569  tumor protein 63 kDa with strong homolog  233.42  68.00
424046  AF027866   Hs.138202  serine (or cysteine) proteinase inhibito  1.00    1.00
424098  AF077374   Hs.139322  small proline-rich protein 3              [illegible]  [illegible]
424834  AK001432   Hs.153408  Homo sapiens cDNA FLJ10570 fis, clone NT  56.19   12.00
425650  NM_001944  Hs.1925    desmoglein 3 (pemphigus vulgaris antigen  33.45   1.00
427099  AB032953   Hs.173560  odd Oz/ten-m homolog 2 (Drosophila, mous  4.24    17.00
427335  AA448542   Hs.251677  G antigen 7B                              51.83   4.00
428182  BE386042   Hs.293317  ESTs, Weakly similar to GGC1_HUMAN G ANT  1.00    1.00
428645  AA431400   Hs.98729   ESTs, Weakly similar to 2017205A dihydro  1.00    16.00
428748  AW593206   Hs.98785   Ksp37 protein                             1.00    87.00
429259  AA420450   Hs.292911  ESTs, Highly similar to S60712 band-6-pr  2.01    1.18
429538  BE182592   Hs.11261   small proline-rich protein 2A             4.43    2.90
429903  AL134197   Hs.93597   cyclin-dependent kinase 5, regulatory su  11.80   1.00
430486  BE062109   Hs.241551  chloride channel, calcium activated, fam  12.28   41.00
430890  X54232     Hs.2699    glypican 1                                1.58    1.40
431009  BE149762   Hs.48956   gap junction protein, beta 6 (connexin 3  60.25   28.00
431846  BE019924   Hs.271580  uroplakin 1B                              4.49    2.51
433091  Y12642     Hs.3185    lymphocyte antigen 6 complex, locus D     1.20    1.09
434360  AW015415   Hs.127780  ESTs                                      40.98   27.00
434880  U02388     Hs.101     cytochrome P450, subfamily IVF, polypept  1.00    1.00
435505  AF200492   Hs.211238  interleukin-1 homolog 1                   1.00    38.00
435793  AB037734   Hs.4993    KIAA1313 protein                          23.68   42.00
436511  AA721252   Hs.291502  ESTs                                      16.76   14.00
438403  AA806607   Hs.292206  ESTs                                      1.00    1.00
439285  AL133916              hypothetical protein FLJ20093             46.23   139.00
439606  W79123     Hs.58561   G protein-coupled receptor 87             33.61   1.00
439670  AF088076   Hs.59507   ESTs, Weakly similar to AC004858 3 U1 sm  1.00    1.00
439706  AW872527   Hs.59761   ESTs, Weakly similar to DAP1_HUMAN DEATH  86.55   11.00
440325  NM_003812  Hs.7164    a disintegrin and metalloproteinase doma  62.88   147.00
441525  AW241867   Hs.127728  ESTs                                      1.53    1.42
443162  T49951     Hs.9029    DKFZP434G032 protein                      31.11   38.00
444378  R41339     Hs.12569   ESTs                                      1.00    1.00
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446292 


AF081497 


Hs.279682 


Rh type C glycoprotein 


447078 


AW885727 


Hs.9914 


ESTs 


447342 


AI199268


Hs.19322 


Homo sapiens, Similar to RIKEN cDNA 2010 


449003 


X76342 


Hs.389 


alcohol dehydrogenase 7 (class IV), mu o 


449101 


AA205847 


Hs.23016 


G protein-coupled receptor 


450832 


AW970602 


Hs.105421


ESTs 


452240 


AI591147


Hs.61232 


ESTs 


453317 


NM_002277


Hs.41696 


keratin, hair, acidic, 1 


453830 


AA534296 


Hs.20953 


ESTs 


454098 


W27953 


Hs.292911 


ESTs, Highly similar to S60712 band-6-pr 


455601 


AI368680


Hs.816 


SRY (sex determining region Y)-box 2 


TABLE 12B 










1.55 


1.26 


47.24 


24.00 


28.63 


1.00 


1.00 


1.00 


2.58 


27.00 


25.17 


36.00 


13.42 


1.00 


1.19 


1.27 


24.92 


25.00 


1.26 


1.11 


206.11 


1.00 



Pkey: Unique Eos probeset identifier number 
CAT number: Gene cluster number 
Accession: Genbank accession numbers 



Pkey CAT Number Accession

439285 47065_1 AL133916 N79113 AF086101 N76721 AW950828 AA364013 AW955684 AI346341 AI867454 N54784 AI655270 AI421279 AW014882
AA775552 N62351 N59253 AA626243 AI341407 BE175639 AA456968 AI358918 AA457077



TABLE 12C 

Pkey: Unique number corresponding to an Eos probeset

Ref: Sequence source. The 7 digit numbers in this column are Genbank Identifier (GI) numbers. "Dunham I. et al." refers to the publication entitled "The DNA
sequence of human chromosome 22." Dunham I. et al., Nature (1999) 402:489-495.
Strand: Indicates DNA strand from which exons were predicted.
Nt_position: Indicates nucleotide positions of predicted exons.



Pkey Ref Strand Nt_position

400666 8118496 Plus 17982-18115,20297-20456
401780 7249190 Minus 28397-28617,28920-29045,29135-29296,29411-29567,29705-29787,30224-30573
401781 7249190 Minus 83215-83435,83531-83656,83740-83901,84237-84393,84955-85037,86290-86814
401785 7249190 Minus 165776-165996,166189-166314,166408-166569,167112-167268,167387-167469,168634-168942
401994 4153858 Minus 42904-43124,43211-43336,44607-44763,45199-45281,46337-46732
402075 8117407 Plus 121907-122035,122804-122921,124019-124161,124455-124610,125672-126076
404996 6007890 Plus 37999-38145,38652-38998,39727-39872,40557-40674,42351-42450
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TABLE 13A: Genes Distinguishing Non-Malignant Lung Disease from Lung Tumors and Normal lung 

Table 13A shows about 23 genes upregulated in non-malignant lung disease relative to lung tumors and normal lung. These genes were selected from about 59680 probesets on 
the Eos/Affymetrix Hu03 Genechip array. 

Table 13B shows the accession numbers for those Pkey's lacking UnigeneID's for Table 13A. For each probeset we have listed the gene cluster number from which the
oligonucleotides were designed. Gene clusters were compiled using sequences derived from Genbank ESTs and mRNAs. These sequences were clustered based on sequence
similarity using Clustering and Alignment Tools (DoubleTwist, Oakland, California). The Genbank accession numbers for sequences comprising each cluster are listed in the
"Accession" column.

Table 13C shows the genomic positioning for those Pkey's lacking Unigene ID's and accession numbers in Table 13A. For each predicted exon, we have listed the genomic
sequence source used for prediction. Nucleotide locations of each predicted exon are also listed.

Pkey: Unique Eos probeset identifier number 

ExAccn: Exemplar Accession number, Genbank accession number 

UnigeneID: Unigene number

Unigene Title: Unigene gene title 

R1: Average of lung tumors (including squamous cell carcinomas, adenocarcinomas, small cell carcinomas, granulomatous and carcinoid tumors) divided by the
average of normal lung samples

R2: Average of non-malignant lung disease samples (including bronchitis, emphysema, fibrosis, atelectasis, asthma) divided by the average of normal lung samples 
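
The following sketch illustrates, under stated assumptions, how probesets of the kind listed in Table 13A might be flagged from their R1 and R2 values. The function name and the thresholds min_r2 and min_excess are hypothetical; the criteria actually applied to select the genes are not specified in this section, so the sketch should not be read as the selection method itself. The three example rows use R1 and R2 values that appear in Tables 13A and 11A.

```python
def flag_disease_enriched(rows, min_r2=100.0, min_excess=2.0):
    """Return Pkeys whose R2 is high and clearly exceeds R1.

    rows: iterable of (pkey, r1, r2) tuples.
    min_r2 and min_excess are illustrative thresholds, not values from the source.
    """
    flagged = []
    for pkey, r1, r2 in rows:
        if r2 >= min_r2 and r2 >= min_excess * r1:
            flagged.append(pkey)
    return flagged


rows = [
    ("408562", 1.00, 230.00),   # Table 13A
    ("431089", 23.32, 941.00),  # Table 13A
    ("408908", 7.76, 1.00),     # Table 11A, elevated in tumors rather than disease
]
print(flag_disease_enriched(rows))
```

With these assumed thresholds the two Table 13A probesets are retained and the Table 11A probeset is not.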



Pkey    ExAccn     UnigeneID  Unigene Title                             R1      R2

408562  AI436323   Hs.31141   Homo sapiens mRNA for KIAA1568 protein,   1.00    230.00
409031  AA376836   Hs.76728   ESTs                                      1.00    128.00
412372  R65998     Hs.285243  hypothetical protein FLJ22029             1.00    173.00
415910  U20350     Hs.78913   chemokine (C-X3-C) receptor 1             1.00    145.00
417511  AL049176   Hs.82223   chordin-like                              1.00    179.00
418819  AA228776   Hs.191721  ESTs                                      1.00    140.00
422060  R20893     Hs.325823  ESTs, Moderately similar to ALU5_HUMAN A  1.00    156.00
424585  AA464840   Hs.131987  ESTs                                      1.00    167.00
426753  T89832     Hs.170278  ESTs                                      1.00    141.00
429496  AA453800   Hs.192793  ESTs                                      1.00    138.00
430719  AA488988   Hs.293796  ESTs                                      1.00    133.00
431089  BE041395              ESTs, Weakly similar to unknown protein   23.32   941.00
431385  BE178536   Hs.11090   membrane-spanning 4-domains, subfamily A  1.00    157.00
431728  NM_007351  Hs.268107  multimerin                                1.00    157.00
436532  AA721522              gb:nv54h12.r1 NCI_CGAP_Ew1 Homo sapiens   1.00    218.00
437960  AI669586   Hs.222194  ESTs                                      1.00    147.00
438202  AW169287   Hs.22588   ESTs                                      1.00    141.00
441499  AW298235   Hs.101689  ESTs                                      1.00    167.00
444513  AL120214   Hs.7117    glutamate receptor, ionotropic, AMPA 1    1.00    151.00
448253  H25899     Hs.201591  ESTs                                      1.00    141.00
453636  R67837     Hs.169872  ESTs                                      1.00    116.00
458332  AI000341   Hs.220491  ESTs                                      1.00    192.00
459587  AA031956              gb:zk15e04.s1 Soares_pregnant_uterus_NbH  1.00    154.00


TABLE 13B 

Pkey: Unique Eos probeset identifier number 
CAT number: Gene cluster number 
Accession: Genbank accession numbers 



Pkey CAT Number Accession 

431089 327825_1 BE041395 AA491826 AA621946 AA715980 AA666102
436532 421802_1 AA721522 AW975443 T93070



TABLE 13C 

Pkey: Unique number corresponding to an Eos probeset

Ref: Sequence source. The 7 digit numbers in this column are Genbank Identifier (GI) numbers. "Dunham I. et al." refers to the publication entitled "The DNA
sequence of human chromosome 22." Dunham I. et al., Nature (1999) 402:489-495.

Strand: Indicates DNA strand from which exons were predicted.

Nt_position: Indicates nucleotide positions of predicted exons.

Pkey Ref Strand Nt_position

402075 8117407 Plus 121907-122035,122804-122921,124019-124161,124455-124610,125672-126076 
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TABLE 14A: Preferred Utility and Subcellular Localization for Potential Lung Disease Targets 

Table 14A shows the subcellular localization and preferred utility for the genes appearing in Tables 9A and 10A. mAb symbolizes monoclonal antibody, diag symbolizes 
diagnostic, s.m. symbolizes small molecule, and CTL symbolizes cytotoxic lymphocytic ligand. These genes were selected from 59680 probesets on the Eos/Affymetrix Hu03 
Genechip array. 

Table 14B shows the accession numbers for those Pkey's lacking UnigeneID's for Table 14A. For each probeset we have listed the gene cluster number from which the
oligonucleotides were designed. Gene clusters were compiled using sequences derived from Genbank ESTs and mRNAs. These sequences were clustered based on sequence
similarity using Clustering and Alignment Tools (DoubleTwist, Oakland, California). The Genbank accession numbers for sequences comprising each cluster are listed in the
"Accession" column.

Table 14C shows the genomic positioning for those Pkey's lacking Unigene ID's and accession numbers in Table 14A. For each predicted exon, we have listed the genomic
sequence source used for prediction. Nucleotide locations of each predicted exon are also listed.



Pkey: Unique Eos probeset identifier number 
ExAccn: Exemplar Accession number, Genbank accession number 
UnigeneID: Unigene number
Unigene Title: Unigene gene title 
Pref. Utility: Preferred Utility 



Pred. Loc: Predicted subcellular localization






Pkey    ExAccn     UnigeneID  Unigene Title                             Pref Utility       Pred Loc

400289  X07820     Hs.2258    matrix metalloproteinase 10 (stromelysin  mAb & diag & s.m.  extracellular
400303  AA242758   Hs.79136   LIV-1 protein, estrogen regulated         mAb                plasma membrane
402075                        ENSP00000251056*:Plasma membrane calcium  mAb & diag         secreted
407811  AW190902   Hs.40098   cysteine knot superfamily 1, BMP antagon  diag               secreted
408243  Y00787     Hs.624     interleukin 8                             diag               secreted
408790  AW580227   Hs.47860   neurotrophic tyrosine kinase, receptor,   mAb & s.m.         plasma membrane
408908  BE296227   Hs.250822  serine/threonine kinase 15                s.m.               cytoplasm
409041  AB033025   Hs.50081   Hypothetical protein, XP_051860 (KIAA119  CTL & diag         secreted
409103  AF251237   Hs.112208  XAGE-1 protein                            CTL                nuclear
409420  Z15008     Hs.54451   laminin, gamma 2 (nicein (100kD), kalini  diag               secreted
409632  W74001     Hs.55279   serine (or cysteine) proteinase inhibito  diag               secreted
409757  NM_001898  Hs.123114  cystatin SN                               diag               extracellular
409893  AW247090   Hs.57101   minichromosome maintenance deficient (S.  CTL                nuclear
409956  AW103364   Hs.727     inhibin, beta A (activin A, activin AB a  diag               extracellular
410001  AB041036   Hs.57771   kallikrein 11                             diag               extracellular
410407  X66839     Hs.63287   carbonic anhydrase IX                     mAb & s.m.         plasma membrane
410418  D31382     Hs.63325   transmembrane protease, serine 4          mAb & diag & s.m.  plasma membrane
412140  AA219691   Hs.73625   RAB6 interacting, kinesin-like (rabkines  s.m.
412719  AW016610   Hs.816     ESTs                                      s.m.               nuclear
414774  X02419     Hs.77274   plasminogen activator, urokinase          diag               extracellular
414883  AA926960              CDC28 protein kinase 1                    s.m.
415138  C18356     Hs.295944  tissue factor pathway inhibitor 2         CTL & diag         extracellular
415669  NM_005025  Hs.78589   serine (or cysteine) proteinase inhibito  mAb & diag & s.m.  secreted
415817  U88967     Hs.78867   protein tyrosine phosphatase, receptor-t  mAb & s.m.         plasma membrane
416658  U03272     Hs.79432   fibrillin 2 (congenital contractural ara  diag               extracellular
417034  NM_006183  Hs.80962   neurotensin                               diag               extracellular
417079  U65590     Hs.81134   interleukin 1 receptor antagonist         diag               extracellular
417308  H60720     Hs.81892   KIAA0101 gene product                     s.m.               mitochondrial
417389  BE260964   Hs.82045   midkine (neurite growth-promoting factor  mAb & diag         secreted
417433  BE270266   Hs.82128   5T4 oncofetal trophoblast glycoprotein    mAb                plasma membrane
417933  X02308     Hs.82962   thymidylate synthetase                    s.m.               endoplasmic reticulum
418478  U38945     Hs.1174    cyclin-dependent kinase inhibitor 2A (me  s.m.               cytoplasm
418506  AA084248   Hs.85339   G protein-coupled receptor 39             mAb & s.m.         plasma membrane
418678  NM_001327  Hs.167379  cancer/testis antigen (NY-ESO-1)          CTL                cytoplasmic
419121  AA374372   Hs.89626   parathyroid hormone-like hormone          diag               secreted
419171  NM_002846  Hs.89655   protein tyrosine phosphatase, receptor t  mAb & s.m.         plasma membrane
419183  U60669     Hs.89663   cytochrome P450, subfamily XXIV (vitamin  CTL & s.m.         mitochondrial
419216  AU076718   Hs.164021  small inducible cytokine subfamily B (Cy  diag               secreted
419235  AW470411   Hs.288433  neurotrimin                               mAb & diag         plasma membrane
419452  U33635     Hs.90572   PTK7 protein tyrosine kinase 7            mAb & s.m.         plasma membrane
419556  U29615     Hs.91093   chitinase 1 (chitotriosidase)             mAb & diag         extracellular*
420610  AI683183   Hs.99348   distal-less homeo box 5                   CTL                nuclear
421110  AJ250717   Hs.1355    cathepsin E                               s.m. & diag        extracellular
421379  Y15221     Hs.103982  small inducible cytokine subfamily B (Cy  diag               secreted
421474  U76362     Hs.104637  solute carrier family 1 (glutamate trans  mAb & s.m.         plasma membrane
421552  AF026692   Hs.105700  secreted frizzled-related protein 4       diag               secreted
421753  BE314828   Hs.107911  ATP-binding cassette, sub-family B (MDR/  mAb & s.m.         plasma membrane
421817  AF146074   Hs.108660  ATP-binding cassette, sub-family C (CFTR  mAb & s.m.         plasma membrane
422109  S73265     Hs.1473    gastrin-releasing peptide                 diag               secreted
422158  L10343     Hs.112341  protease inhibitor 3, skin-derived (SKAL  diag               secreted
422282  AF019225   Hs.114309  apolipoprotein L                          diag               secreted
422283  AW411307   Hs.114311  CDC45 (cell division cycle 45, S.cerevis  s.m.               nuclear
422424  AI186431   Hs.296638  prostate differentiation factor           diag               extracellular
422765  AW409701   Hs.1578    baculoviral IAP repeat-containing 5 (sur  s.m.               cytoplasm
422809  AK001379   Hs.121028  hypothetical protein FLJ10549             s.m.               nuclear
422867  L32137     Hs.1584    cartilage oligomeric matrix protein (pse  diag               extracellular
422956  BE545072   Hs.122579  ECT2 protein (Epithelial cell transformi  CTL & s.m.
423634  AW959908   Hs.1690    heparin-binding growth factor binding pr  diag
423673  BE003054   Hs.1695    matrix metalloproteinase 12 (macrophage   mAb & diag & s.m.  secreted
423961  D13666     Hs.136348  periostin (OSF-2os)                       mAb & diag         extracellular
424046  AF027866   Hs.138202  serine (or cysteine) proteinase inhibito  diag               secreted
424381  AA285249   Hs.146329  protein kinase Chk2                       s.m.               nuclear
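
The abbreviations in the Pref Utility column can be expanded mechanically from the legend of Table 14A. The following is a minimal sketch only, assuming that multiple utilities in one cell are joined by "&" as in the rows above; the names `UTILITY_CODES` and `parse_utility` are hypothetical and are not part of the original disclosure.

```python
# Abbreviations exactly as defined in the Table 14A legend.
UTILITY_CODES = {
    "mAb": "monoclonal antibody",
    "diag": "diagnostic",
    "s.m.": "small molecule",
    "CTL": "cytotoxic lymphocytic ligand",
}


def parse_utility(cell):
    """Split a Pref Utility cell such as 'mAb & diag & s.m.' into full names."""
    codes = [code.strip() for code in cell.split("&") if code.strip()]
    # Codes not found in the legend are passed through unchanged, not guessed at.
    return [UTILITY_CODES.get(code, code) for code in codes]


if __name__ == "__main__":
    print(parse_utility("mAb & diag & s.m."))
    print(parse_utility("CTL"))
```

A cell that is empty in the table yields an empty list, so rows without a stated utility are not assigned one.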
TABLE 14B



Pkey: Unique Eos probeset identifier number 
CAT number: Gene cluster number 
Accession: Genbank accession numbers 

Pkey CAT Number Accession 
WO 02/086443

414883 15024_1 AA926960 AA926959 W76521 W24270 W21526 AA037172 BE267636 H83186 AA469909 N86396 AA001348 BE535736 AA081745 BE566245
AA082436 H72525 H77575 N49786 W80565 H78746 BE569085 W04339 R98127 T55938 BE279271 AW960304 T29812 AA476873 BE297387
AA292753 AA177048 NM_001826 X54941 BE314366 AA908783 AI719075 BE270172 BE269819 AA889955 AI204630 W25243 AI935150
AA872039 W72395 T99630 AI422691 H98460 N31428 BE255916 H03265 AI857576 AA776920 AA910644 AA459522 AA293140 AW514667
R75953 AW662396 AA662522 AI865147 AI423153 AW262230 AA584410 AA583187 AW024595 AW069734 AI828996 AA282997 AA876046
AW613002 AA527373 AW972459 AI831360 AA621337 AA100926 AA772418 AA594628 AI033892 W95096 AI034317 AA398727 AI085031
N95210 AI459432 AI041437 AA932124 AA627684 AA935829 AI004827 AI423513 AI094597 H42079 R54703 AI630359 AA617681 AA978045
AA643280 W44561 AI991988 AI537692 AI090262 AA740817 AI312104 AI911822 AA416871 AI185409 AA129784 AA701623 AI075239
AI139549 AA633648 AI339996 AI336880 AA399239 AI078708 AI085351 AI362835 AI346618 AI146955 AI989380 AI348243 N92892 AA765850
AI494230 AI278887 AA962596 AI492600 W80435 AA001979 R97424 AI129015 N24127 AA157451 AA235549 AA459292 AA037114 AA129785
AI494211 AW059601 AW886710 R92790 N59755 AI361128 AW589407 H47725 H97534 H48076 H48450 T99631 AW300758 H03431 R76789
AA954344 H77576 R96823 AI457100 N92845 N49682 H42038 BE220698 BE220715 H99552 AA701624 N74173 R54704 H79520 H72923
H03266 BE261919 AA769633 AA480310 AA507454 AA910586 AI203723 AW104725 W25611 W25071 T88980 H03513 T77589 R99156
W95095 R97470 AA702275 T77551 AA911952 H82956 N83673 AA283672

450375 83327_1 AA009647 AA131254 AA374293 AW954405 H04410 AW606284 AA151166 BE157467 BE157601 H04384 W46291 AW663674 H04021 H01532
AA190993 H03231 H59605 H01642 AA852876 AA113758 AA626915 AA746952 AI161014 AA099554 R69067



TABLE 14C 

Pkey: Unique number corresponding to an Eos probeset 

Ref: Sequence source. The 7 digit numbers in this column are Genbank Identifier (GI) numbers. "Dunham I. et al." refers to the publication entitled "The DNA sequence of human chromosome 22." Dunham I. et al., Nature (1999) 402:489-495.

Strand: Indicates DNA strand from which exons were predicted.

Nt_position: Indicates nucleotide positions of predicted exons.

Pkey Ref Strand Nt_position

402075 8117407 Plus 121907-122035,122804-122921,124019-124161,124455-124610,125672-126076 
PCT/US02/12476



TABLE 15A: Information for all sequences in Table 16 

Table 15A shows the Seq ID No, Pkey, ExAccn, UnigeneID, and Unigene Title for all of the sequences in Table 16.

Table 15B shows the accession numbers for those Pkeys lacking UnigeneIDs in Table 15A. For each probeset we have listed the gene cluster number from which the
oligonucleotides were designed. Gene clusters were compiled using sequences derived from Genbank ESTs and mRNAs. These sequences were clustered based on sequence
similarity using Clustering and Alignment Tools (DoubleTwist, Oakland, California). The Genbank accession numbers for sequences comprising each cluster are listed in the
"Accession" column.

Table 15C shows the genomic positioning for those Pkeys lacking UnigeneIDs and accession numbers in Table 15A. For each predicted exon, we have listed the genomic
sequence source used for prediction. Nucleotide locations of each predicted exon are also listed.



Seq ID No: Sequence ID number 

Pkey: Unique Eos probeset identifier number 

ExAccn: Exemplar Accession number, Genbank accession number 

UnigeneID: Unigene number

Unigene Title: Unigene gene title 



Seq ID No             Pkey    ExAccn

Seq ID No: 1 & 2      410407  X66839
Seq ID No: 3 & 4      412719  AW016610
Seq ID No: 5 & 6      417034  NM_006183
Seq ID No: 7 & 8      430486  BE062109
Seq ID No: 9 & 10     407788  BE514982
Seq ID No: 11 & 12    407788  BE514982
Seq ID No: 13 & 14    407788  BE514982
Seq ID No: 15 & 16    407788  BE514982
Seq ID No: 17 & 18    439285  AL133916
Seq ID No: 19 & 20    413753  U17760
Seq ID No: 21 & 22    120486  AW368377
Seq ID No: 23 & 24    425650  NM_001944
Seq ID No: 25 & 26    412140  AA219691
Seq ID No: 27 & 28    423673  BE003054
Seq ID No: 29 & 30    452838  U65011
Seq ID No: 31 & 32    418663  AK001100
Seq ID No: 33 & 34    418663  AK001100
Seq ID No: 35 & 36    409632  W74001
Seq ID No: 37 & 38    429610  AB024937
Seq ID No: 39 & 40    406690  M29540
Seq ID No: 41 & 42    431846  BE019924
Seq ID No: 43 & 44    418830  BE513731
Seq ID No: 45 & 46    424098  AF077374
Seq ID No: 47 & 48    443648  AI085377
Seq ID No: 49         311034  BE567130
Seq ID No: 50 & 51    408522  AI541214
Seq ID No: 52 & 53    422158  L10343
Seq ID No: 54 & 55    435505  AF200492
Seq ID No: 56 & 57    417366  BE185289
Seq ID No: 58 & 59    431958  X63629
Seq ID No: 60 & 61    441020  W79283
Seq ID No: 62 & 63    423217  NM_000094
Seq ID No: 64 & 65    429538  BE182592
Seq ID No: 66 & 67    448733  NM_005629
Seq ID No: 68 & 69    444371  BE540274
Seq ID No: 70 & 71    444371  BE540274
Seq ID No: 72 & 73    444371  BE540274
Seq ID No: 74 & 75    422168  AA586894
Seq ID No: 76 & 77    422168  AA586894
Seq ID No: 78 & 79    429259  AA420450
Seq ID No: 80 & 81    426440  BE382756
Seq ID No: 82 & 83    437044  AL035864
Seq ID No: 84 & 85    423662  AK001035
Seq ID No: 86 & 87    428484  AF104032
Seq ID No: 88 & 89    429211  AF052693
Seq ID No: 90 & 91    417389  BE260964
Seq ID No: 92 & 93    423634  AW959908
Seq ID No: 94 & 95    417515  L24203
Seq ID No: 96 & 97    441362  BE614410
Seq ID No: 98 & 99    425322  U63630
Seq ID No: 100 & 101  449003  X76342
Seq ID No: 102 & 103  431009  BE149762
Seq ID No: 104 & 105  409103  AF251237
Seq ID No: 106 & 107  417542  J04129
Seq ID No: 108 & 109  428471  X57348
Seq ID No: 110 & 111  418004  U37519
Seq ID No: 112 & 113  414761  AU077228
Seq ID No: 114 & 115  418203  X54942
Seq ID No: 116        447343  AA256641
Seq ID No: 117 & 118  437016  AU076916
Seq ID No: 119 & 120  449230  BE613348
Seq ID No: 121 & 122  446989  AK001898
Seq ID No: 123 & 124  457819  AA057484
Seq ID No: 125 & 126  424687  J05070



UnigeneID
AI00/41Z


Ue 1H7R01 
MS. IO/ OU 1 


447033 


AIQC7JHO 

Aloo/41z 


Ue 1R7RH1 
MS.IO'OUI 


115522 


bbo14oo/ 


Ue n^RQI 

MS.ooooyo 


410418 


D31382


Hs.63325


409041 


AB033025 


Hs.50081 


409041 


AB033025 


Hs.50081 


452461 


N78223 


Hs.108106 


412420 


AL035668 


Hs.73853 


416658 


U03272 


Hs.79432 


407811 


AW190902


Hs.40098 



glutamate-cysteine ligase, catalytic sub 
ESTs, Weakly similar to T17330 hypotheti 
ESTs, Weakly similar to T17330 hypotheti 
ESTs, Weakly similar to T17330 hypotheti 
ESTs, Weakly similar to T17330 hypotheti 
ESTs, Weakly similar to T17330 hypotheti 
High mobility group (nonhistone chromoso 
NM_022342:Homo sapiens kinesin protein 9
ESTs

presenilins associated rhomboid-like pro
nitric oxide synthase 2A (inducible, hep
ATPase, Class VI, type 11B
forkhead box E1 (thyroid transcription f 
ESTs, Weakly similar to 2109260A B cell 
peptidylglycine alpha-amidating monooxyg 
hypothetical protein MGC5350 
ESTs 

unnamed protein product [Homo sapiens] 
minichromosome maintenance deficient (S.
v-ets erythroblastosis virus E26 oncogen
ESTs, Weakly similar to S41044 chromosom
guanine nucleotide binding protein 11
calcitonin receptor-like
cadherin 5, type 2, VE-cadherin (vascula 
singed (Drosophila)-like (sea urchin fas 
Homo sapiens HSPC285 mRNA, partial cds 
complement component C1q receptor 
ESTs 

protease inhibitor 3, skin-derived (SKAL
plakophilin 3 

RAN, member RAS oncogene family 
parathyroid hormone-like hormone 
low density lipoprotein receptor-related 
endogenous retroviral protease 
collagen, type XI, alpha 1
SRY (sex determining region Y)-box 4
guanine monphosphate synthetase
pituitary tumor-transforming 1
insulin-like growth factor binding prote
SRB7 (suppressor of RNA polymerase B, ye
butyrate-induced transcript 1
butyrate-induced transcript 1
small proline-rich protein 1B (cornifin)
H2A histone family, member X
gb:Homo sapiens full length insert cDNA
glycoprotein (transmembrane) nmb
alpha-fetoprotein

integrin, alpha 5 (fibronectin receptor, 
matrix metalloproteinase 10 (stromelysin
matrix metalloproteinase 1 (interstitial 
matrix metalloproteinase 1 (interstitial 
solute carrier family 7, (cationic amino 
tissue factor pathway inhibitor 2 
G protein-coupled receptor 39 
periostin (OSF-2os) 
monokine induced by gamma interferon 
5T4 oncofetal trophoblast glycoprotein
5T4 oncofetal trophoblast glycoprotein 
cartilage oligomeric matrix protein (pse 
small inducible cytokine subfamily B (Cy 
ESTs, Weakly similar to S64054 hypotheti 
LIV-1 protein, estrogen regulated 
Adlican 

KIAA1866 protein 

hypothetical protein FLJ21080 

secreted frizzled-related protein 4

Ig superfamily receptor LNIR 

a disintegrin and metalloproteinase doma 

stanniocalcin 2 

matrix metalloproteinase 11 (stromelysin
Transmembrane protease, serine 3
collagen, type X, alpha 1 (Schmid metaph
ESTs; hypothetical protein for IMAGE:447
gap junction protein, beta 2, 26kD (conn 
ESTs 
ESTs 
ESTs 

c-Myc target JP01 
transmembrane protease, serine 4 
Hypothetical protein, XP_051860 (KIAA119
Hypothetical protein, XP_051860 (KIAA119
transcription factor
bone morphogenetic protein 2
fibrillin 2 (congenital contractural ara
cysteine knot superfamily 1, BMP antagon 
Seq ID No: 462 & 463  437852  BE001836   Hs.256897  ESTs, Weakly similar to dJ365012.1 [H.sa
Seq ID No: 464 & 465  402075                        ENSP00000251056*:Plasma membrane calcium
Seq ID No: 466 & 467  421110  AJ250717   Hs.1355    cathepsin E
Seq ID No: 468 & 469  451668  Z43948     Hs.326444  cartilage acidic protein 1
Seq ID No: 470 & 471  451668  Z43948     Hs.326444  cartilage acidic protein 1
Seq ID No: 472 & 473  451668  Z43948     Hs.326444  cartilage acidic protein 1
Seq ID No: 474 & 475  422282  AF019225   Hs.114309  apolipoprotein L
Seq ID No: 476 & 477  425852  AK001504   Hs.159651  death receptor 6, TNF superfamily member
Seq ID No: 478 & 479  439738  BE246502   Hs.9598    sema domain, immunoglobulin domain (Ig),
Seq ID No: 480 & 481  427747  AW411425   Hs.loUobo  serine/threonine kinase 12
Seq ID No: 482 & 483  420281  AI623693   Hs.323494  Predicted cation efflux pump
Seq ID No: 484 & 485  405932                        C15000305:gi|3806122|gb|AAC69198.1|(AF0
Seq ID No: 486 & 487  405932                        C15000305:gi|3806122|gb|AAC69198.1|(AF0
Seq ID No: 488 & 489  444342  NM_014398  Hs.10887   similar to lysosome-associated membrane
Seq ID No: 490 & 491  421379  Y15221     Hs.103982  small inducible cytokine subfamily B (Cy
Seq ID No: 492 & 493  417079  U65590     Hs.81134   interleukin 1 receptor antagonist
Seq ID No: 494 & 495  430890  X54232     Hs.2699    glypican 1
Seq ID No: 496 & 497  419721  NM_001650  Hs.288650  aquaporin 4
Seq ID No: 498 & 499  444471  AB020684   Hs.11217   KIAA0877 protein
Seq ID No: 500 & 501  413063  AL035737   Hs.75184   chitinase 3-like 1 (cartilage glycoprote
Seq ID No: 502 & 503  433800  AI034361   Hs.135150  lung type-I cell membrane-associated gly
Seq ID No: 504 & 505  452401  NM_007115  Hs.29352   tumor necrosis factor, alpha-induced pro
Seq ID No: 506 & 507  452401  NM_007115  Hs.29352   tumor necrosis factor, alpha-induced pro
Seq ID No: 508 & 509  450001  NM_001044  HS.4UD     solute carrier family 6 (neurotransmitte
Seq ID No: 510 & 511  410407  X66839     Hs.63287   carbonic anhydrase IX
Seq ID No: 512 & 513  309931  AW341683              gb:hd13d01.x1 Soares_NFL_T_GBC_S1 Homo s
Seq ID No: 514 & 515  412719  AW016610   Hs.816     ESTs
Seq ID No: 516 & 517  417034  NM_006183  Hs.80962   neurotensin
Seq ID No: 518 & 519  430486  BE062109   Hs.241551  chloride channel, calcium activated, fam
Seq ID No: 520 & 521  413753  U17760     Hs.75517   laminin, beta 3 (nicein (125kD), kalinin
Seq ID No: 522 & 523  425650  NM_001944  Hs.1925    desmoglein 3 (pemphigus vulgaris antigen
Seq ID No: 524 & 525  423673  BE003054   Hs.1695    matrix metalloproteinase 12 (macrophage
Seq ID No: 526 & 527  418663  AK001100   Hs.41690   desmocollin 3
Seq ID No: 528 & 529  418663  AK001100   Hs.41690   desmocollin 3
Seq ID No: 530 & 531  429610  AB024937   Hs.211092  LUNX protein; PLUNC (palate lung and nas
Seq ID No: 532 & 533  406690  M29540     Hs.220529  carcinoembryonic antigen-related cell ad
Seq ID No: 534 & 535  431846  BE019924   Hs.271580  uroplakin 1B
Seq ID No: 536 & 537  422158  L10343     Hs.112341  protease inhibitor 3, skin-derived (SKAL
Seq ID No: 538 & 539  431958  X63629     Hs.2877    cadherin 3, type 1, P-cadherin (placenta
Seq ID No: 540 & 541  437044  AL035864   Hs.69517   differentially expressed in Fanconi's an
Seq ID No: 542 & 543  428484  AF104032   Hs.184601  solute carrier family 7 (cationic amino
Seq ID No: 544 & 545  429211  AF052693   Hs.198249  gap junction protein, beta 5 (connexin 3
Seq ID No: 546 & 547  417389  BE260964   Hs.82045   midkine (neurite growth-promoting factor
Seq ID No: 548 & 549  431009  BE149762   Hs.48956   gap junction protein, beta 6 (connexin 3
Seq ID No: 550 & 551  417542  J04129     Hs.82269   progestagen-associated endometrial prote
Seq ID No: 552 & 553  449230  BE613348   Hs.211579  melanoma cell adhesion molecule
Seq ID No: 554 & 555  410555  U92649     Hs.64311   a disintegrin and metalloproteinase doma
Seq ID No: 556 & 557  410555  U92649     Hs.64311   a disintegrin and metalloproteinase doma
Seq ID No: 558 & 559  424687  J05070     Hs.151738  matrix metalloproteinase 9 (gelatinase B
Seq ID No: 560 & 561  418462  BE001596   Hs.85266   integrin, beta 4
Seq ID No: 562 & 563  410274  AA381807   Hs.61762   hypoxia-inducible protein 2
Seq ID No: 564 & 565  439606  W79123     Hs.58561   G protein-coupled receptor 87
Seq ID No: 566 & 567  404877                        NM_005365:Homo sapiens melanoma antigen,
Seq ID No: 568 & 569  444781  NM_014400  Hs.11950   GPI-anchored metastasis-associated prote
Seq ID No: 570 & 571  418543  NM_005329  Hs.85962   hyaluronan synthase 3
Seq ID No: 572 & 573  415817  U88967     Hs.78867   protein tyrosine phosphatase, receptor-t
Seq ID No: 574 & 575  415817  U88967     Hs.78867   protein tyrosine phosphatase, receptor-t
Seq ID No: 576 & 577  415817  U88967     Hs.78867   protein tyrosine phosphatase, receptor-t
Seq ID No: 578 & 579  415817  U88967     Hs.78867   protein tyrosine phosphatase, receptor-t
Seq ID No: 580 & 581  415817  U88967     Hs.78867   protein tyrosine phosphatase, receptor-t
Seq ID No: 582 & 583  415817  U88967     Hs.78867   protein tyrosine phosphatase, receptor-t
Seq ID No: 584 & 585  421817  AF146074   Hs.108660  ATP-binding cassette, sub-family C (CFTR
Seq ID No: 586 & 587  418678  NM_001327  Hs.167379  cancer/testis antigen (NY-ESO-1)
Seq ID No: 588 & 589  418678  NM_001327  Hs.167379  cancer/testis antigen (NY-ESO-1)
Seq ID No: 590 & 591  409420  Z15008     Hs.54451   laminin, gamma 2 (nicein (100kD), kalini
Seq ID No: 592 & 593  332180  AF134160   Hs.7327    claudin 1
Seq ID No: 594 & 595  408790  AW580227   Hs.47860   neurotrophic tyrosine kinase, receptor,
Seq ID No: 596 & 597  408790  AW580227   Hs.47860   neurotrophic tyrosine kinase, receptor,
Seq ID No: 598 & 599  439223  AW238299   Hs.250618  UL16 binding protein 2
Seq ID No: 600 & 601  409757  NM_001898  Hs.123114  cystatin SN
Seq ID No: 602 & 603  428969  AF120274   Hs.194689  artemin
Seq ID No: 604 & 605  428969  AF120274   Hs.194689  artemin
Seq ID No: 606 & 607  428969  AF120274   Hs.194689  artemin
Seq ID No: 608 & 609  428969  AF120274   Hs.194689  artemin
Seq ID No: 610 & 611  450701  H39960     HS.2oo4b/  hypothetical protein XP_098151 (leucine-
Seq ID No: 612 & 613  450701  H39960     HS.2oo4b/  hypothetical protein XP_098151 (leucine-
Seq ID No: 614 & 615  414774  X02419     Hs.77274   plasminogen activator, urokinase
Seq ID No: 616 & 617  407944  R34008     Hs.239727  desmocollin 2
Seq ID No: 618 & 619  407944  R34008     Hs.239727  desmocollin 2
Seq ID No: 620 & 621  457489  AI693815   Hs.127179  cryptic gene
Seq ID No: 622 & 623  429547  AW009166   Hs.99376   ESTs
Seq ID No: 624 & 625  407242  M18728                gb:Human nonspecific crossreacting antig
Seq ID No: 626 & 627  407242  M18728                gb:Human nonspecific crossreacting antig
Seq ID No: 628 & 629  407242  M18728                gb:Human nonspecific crossreacting antig
Seq ID No: 630 & 631  444006  BE395085   Hs.10086   type I transmembrane protein Fn14
Seq ID No: 632 & 633  429597  NM_003816  Hs.2442    a disintegrin and metalloproteinase doma
Seq ID No: 634 & 635  422109  S73265     Hs.1473    gastrin-releasing peptide
Seq ID No: 636 & 637  419235  AW470411   Hs.288433  neurotrimin
Seq ID No: 638 & 639  449048  Z45051     Hs.22920   similar to S68401 (cattle) glucose induc
Seq ID No: 640 & 641  419216  AU076718   Hs.164021  small inducible cytokine subfamily B (Cy
Seq ID No: 642 & 643  431462  AW583672   Hs.256311  granin-like neuroendocrine peptide precu
Seq ID No: 644 & 645  448243  AW369771   Hs.52620   integrin, beta 8
Seq ID No: 646 & 647  426427  M86699     Hs.169840  TTK protein kinase
Seq ID No: 648 & 649  445537  AJ245671   Hs.12844   EGF-like-domain, multiple 6
Seq ID No: 650 & 651  422278  AF072873   Hs.114218  frizzled (Drosophila) homolog 6
Seq ID No: 652 & 653  428450  NM_014791  Hs.184339  KIAA0175 gene product
Seq ID No: 654 & 655  446619  AU076643   Hs.313     secreted phosphoprotein 1 (osteopontin,
Seq ID No: 656 & 657  453392  U23752     Hs.32964   SRY (sex determining region Y)-box 11
Seq ID No: 658 & 659  426514  BE616633   Hs.170195  bone morphogenetic protein 7 (osteogenic
Seq ID No: 660 & 661  425776  U25128     Hs.159499  parathyroid hormone receptor 2
Seq ID No: 662 & 663  425776  U25128     Hs.159499  parathyroid hormone receptor 2
Seq ID No: 664 & 665  431515  NM_012152  Hs.258583  endothelial differentiation, lysophospha
Seq ID No: 666 & 667  419452  U33635     Hs.90572   PTK7 protein tyrosine kinase 7
Seq ID No: 668 & 669  432653  N62096     Hs.293185  ESTs, Weakly similar to JC7328 amino aci
Seq ID No: 670 & 671  432653  N62096     Hs.293185  ESTs, Weakly similar to JC7328 amino aci
Seq ID No: 672 & 673  432653  N62096     Hs.293185  ESTs, Weakly similar to JC7328 amino aci
Seq ID No: 674 & 675  432653  N62096     Hs.293185  ESTs, Weakly similar to JC7328 amino aci
Seq ID No: 676 & 677  410001  AB041036   Hs.57771   kallikrein 11
Seq ID No: 678 & 679  426501  AW043782   Hs.293616  ESTs
Seq ID No: 680 & 681  408369  R38438     Hs.182575  solute carrier family 15 (H??? transport
Seq ID No: 682 & 683  445413  AA151342   Hs.12677   CGI-147 protein
Seq ID No: 684 & 685  422424  AI186431   Hs.296638  prostate differentiation factor
Seq ID No: 686 & 687  428330  L22524     Hs.2256    matrix metalloproteinase 7 (matrilysin,
Seq ID No: 688 & 689  420610  AI683183   Hs.99348   distal-less homeo box 5



TABLE 15B



Pkey: Unique Eos probeset identifier number 
CAT number: Gene cluster number 
Accession: Genbank accession numbers 



Pkey CAT Number Accession 

309931 AW341683

330493 33264_5 M27826 R78416 AA307645 AW957879 AW957800 AA633529 H03662

439285 47065_1 AL133916 N79113 AF086101 N76721 AW950828 AA364013 AW955684 AI346341 AI867454 N54784 AI655270 AI421279 AW014882
AA775552 N62351 N59253 AA626243 AI341407 BE175639 AA456968 AI358918 AA457077

450375 83327_1 AA009647 AA131254 AA374293 AW954405 H04410 AW606284 AA151166 BE157467 BE157601 H04384 W46291 AW663674 H04021 H01532
AA190993 H03231 H59605 H01642 AA852876 AA113758 AA626915 AA746952 AI161014 AA099554 R69067

451320 86576_1 AW118072 AI631982 T15734 AA224195 AI701458 W20198 F26326 AA890570 N90552 AW071907 AI671352 AI375892 T03517 R88265
AA124088 AA224388 AI084316 AI354686 T33652 AI140719 AI720211 T03490 AI372637 T15415 AW205836 AA630384 T03515 T33230
AA017131 AA443303 T33623 AI222556 T33511 T33785 AI419606 D55612
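
A row of the cluster table can be split into its three columns with ordinary whitespace handling. The sketch below is illustrative only: the function name `parse_cluster_row` is hypothetical, and it assumes that a row has already been joined onto one line and that the first two fields are the Pkey and the CAT number (a row that lacks a CAT number would need separate handling).

```python
def parse_cluster_row(row):
    """Split 'Pkey CAT_number accession ...' into (pkey, cat_number, [accessions])."""
    fields = row.split()
    if len(fields) < 3:
        raise ValueError("expected Pkey, CAT number and at least one accession")
    pkey, cat_number = fields[0], fields[1]
    accessions = fields[2:]
    return pkey, cat_number, accessions


if __name__ == "__main__":
    row = "330493 33264_5 M27826 R78416 AA307645 AW957879 AW957800 AA633529 H03662"
    pkey, cat_number, accessions = parse_cluster_row(row)
    print(pkey, cat_number, len(accessions))
```

Keeping the accessions as a list preserves the order in which they are printed in the table, which may matter if the first entry is treated as the exemplar.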



TABLE 15C 



Pkey: Unique number corresponding to an Eos probeset 

Ref: Sequence source. The 7 digit numbers in this column are Genbank Identifier (GI) numbers. "Dunham I. et al." refers to the publication entitled "The DNA sequence of human chromosome 22." Dunham I. et al., Nature (1999) 402:489-495.

Strand: Indicates DNA strand from which exons were predicted.
Nt_position: Indicates nucleotide positions of predicted exons.

Pkey Ref Strand Nt_position

402075 8117407 Plus 121907-122035,122804-122921,124019-124161,124455-124610,125672-126076 

403329 8516120 Plus 96450-96598 

403478 9958258 Plus 116458-116564 

404440 7528051 Plus 80430-81581 

404877 1519284 Plus 1095-2107 

405770 2735037 Plus 61057-62075 

405932 7767812 Minus 123525-123713 
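
The Nt_position column lists predicted exons as comma-separated start-end ranges on the genomic sequence named in the Ref column. As a rough illustration, the sketch below parses one such row and sums the exon lengths. It assumes that the coordinates are 1-based and inclusive, which the table does not state explicitly, and the function names are hypothetical.

```python
def parse_exons(nt_position):
    """Turn '121907-122035,122804-122921' into a list of (start, end) tuples."""
    exons = []
    for part in nt_position.split(","):
        start, end = part.split("-")
        exons.append((int(start), int(end)))
    return exons


def total_exon_length(exons):
    """Sum exon lengths, assuming 1-based inclusive coordinates (an assumption)."""
    return sum(end - start + 1 for start, end in exons)


if __name__ == "__main__":
    row = ("402075 8117407 Plus "
           "121907-122035,122804-122921,124019-124161,124455-124610,125672-126076")
    pkey, ref, strand, nt_position = row.split()
    exons = parse_exons(nt_position)
    print(pkey, ref, strand, len(exons), total_exon_length(exons))
```

For rows on the Minus strand, the ranges would still parse the same way; any reverse-complementing of the underlying sequence is outside the scope of this sketch.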
Seq ID NO: 1 DNA sequence
Nucleic Acid Accession #: NM_001216
Coding sequence: 43..1422

1          11         21         31         41         51
|          |          |          |          |          |
GCCCGTACAC ACCGTGTGCT GGGACACCCC ACAGTCAGCC GCATGGCTCC CCTGTGCCCC 60
AGCCCCTGGC TCCCTCTGTT GATCCCGGCC CCTGCTCCAG GCCTCACTGT GCAACTGCTG 120
CTGTCACTGC TGCTTCTGAT GCCTGTCCAT CCCCAGAGGT TGCCCCGGAT GCAGGAGGAT 180
TCCCCCTTGG GAGGAGGCTC TTCTGGGGAA GATGACCCAC TGGGCGAGGA GGATCTGCCC 240
AGTGAAGAGG ATTCACCCAG AGAGGAGGAT CCACCCGGAG AGGAGGATCT ACCTGGAGAG 300
GAGGATCTAC CTGGAGAGGA GGATCTACCT GAAGTTAAGC CTAAATCAGA AGAAGAGGGC 360
TCCCTGAAGT TAGAGGATCT ACCTACTGTT GAGGCTCCTG GAGATCCTCA AGAACCCCAG 420
AATAATGCCC ACAGGGACAA AGAAGGGGAT GACCAGAGTC ATTGGCGCTA TGGAGGCGAC 480
CCGCCCTGGC CCCGGGTGTC CCCAGCCTGC GCGGGCCGCT TCCAGTCCCC GGTGGATATC 540
CGCCCCCAGC TCGCCGCCTT CTGCCCGGCC CTGCGCCCCC TGGAACTCCT GGGCTTCCAG 600
CTCCCGCCGC TCCCAGAACT GCGCCTGCGC AACAATGGCC ACAGTGTGCA ACTGACCCTG 660
CCTCCTGGGC TAGAGATGGC TCTGGGTCCC GGGCGGGAGT ACCGGGCTCT GCAGCTGCAT 720
CTGCACTGGG GGGCTGCAGG TCGTCCGGGC TCGGAGCACA CTGTGGAAGG CCACCGTTTC 780
CCTGCCGAGA TCCACGTGGT TCACCTCAGC ACCGCCTTTG CCAGAGTTGA CGAGGCCTTG 840
GGGCGCCCGG GAGGCCTGGC CGTGTTGGCC GCCTTTCTGG AGGAGGGCCC GGAAGAAAAC 900
AGTGCCTATG AGCAGTTGCT GTCTCGCTTG GAAGAAATCG CTGAGGAAGG CTCAGAGACT 960
CAGGTCCCAG GACTGGACAT ATCTGCACTC CTGCCCTCTG ACTTCAGCCG CTACTTCCAA 1020
TATGAGGGGT CTCTGACTAC ACCGCCCTGT GCCCAGGGTG TCATCTGGAC TGTGTTTAAC 1080
CAGACAGTGA TGCTGAGTGC TAAGCAGCTC CACACCCTCT CTGACACCCT GTGGGGACCT 1140
GGTGACTCTC GGCTACAGCT GAACTTCCGA GCGACGCAGC CTTTGAATGG GCGAGTGATT 1200
GAGGCCTCCT TCCCTGCTGG AGTGGACAGC AGTCCTCGGG CTGCTGAGCC AGTCCAGCTG 1260
AATTCCTGCC TGGCTGCTGG TGACATCCTA GCCGTGGTTT TTGGCCTCCT TTTTGCTGTC 1320
ACCAGCGTCG CGTTCCTTGT GCAGATGAGA AGGCAGCACA GAAGGGGAAC CAAAGGGGGT 1380
GTGAGCTACC GCCCAGCAGA GGTAGCCGAG ACTGGAGCCT AGAGGCTGGA TCTTGGAGAA 1440
TGTGAGAAGC CAGCCAGAGG CATCTGAGGG GGAGCCGGTA ACTGTCCTGT CCTGCTCATT 1500
ATGCCACTTC CTTTTAACTG CCAAGAAATT TTTTAAAATA AATATTTATA AT

Seq ID NO: 2 Protein sequence:
Protein Accession #: NP_001207

1          11         21         31         41         51
|          |          |          |          |          |
MAPLCPSPWL PLLIPAPAPG LTVQLLLSLL LLMPVHPQRL PRMQEDSPLG GGSSGEDDPL 60
GEEDLPSEED SPREEDPPGE EDLPGEEDLP GEEDLPEVKP KSEEEGSLKL EDLPTVEAPG 120
DPQEPQNNAH RDKEGDDQSH WRYGGDPPWP RVSPACAGRF QSPVDIRPQL AAFCPALRPL 180
ELLGFQLPPL PELRLRNNGH SVQLTLPPGL EMALGPGREY RALQLHLHWG AAGRPGSEHT 240
VEGHRFPAEI HVVHLSTAFA RVDEALGRPG GLAVLAAFLE EGPEENSAYE QLLSRLEEIA 300
EEGSETQVPG LDISALLPSD FSRYFQYEGS LTTPPCAQGV IWTVFNQTVM LSAKQLHTLS 360
DTLWGPGDSR LQLNFRATQP LNGRVIEASF PAGVDSSPRA AEPVQLNSCL AAGDILALVF 420
GLLFAVTSVA FLVQMRRQHR RGTKGGVSYR PAEVAETGA
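
The relationship between a DNA entry, its "Coding sequence" coordinates and the paired protein entry can be checked by translation. The following is a minimal sketch, not the procedure used to prepare the listing: it assumes the coding coordinates are 1-based and inclusive, uses the standard genetic code, and the helper names (`clean_listing`, `extract_cds`, `translate`) are hypothetical. Only the first two printed lines of Seq ID NO: 1 are used, so only the first 26 codons are translated.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (TTT, TTC, TTA, ...).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def clean_listing(lines):
    """Join sequence lines, dropping spaces and the trailing position counters."""
    return "".join(ch for line in lines for ch in line.upper() if ch.isalpha())


def extract_cds(sequence, start, end):
    """Slice a coding region given 1-based inclusive coordinates (an assumption)."""
    return sequence[start - 1:end]


def translate(cds):
    """Translate codon by codon, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(cds) - len(cds) % 3, 3):
        aa = CODON_TABLE.get(cds[i:i + 3], "X")  # "X" marks ambiguous codons
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


if __name__ == "__main__":
    listing = [
        "GCCCGTACAC ACCGTGTGCT GGGACACCCC ACAGTCAGCC GCATGGCTCC CCTGTGCCCC 60",
        "AGCCCCTGGC TCCCTCTGTT GATCCCGGCC CCTGCTCCAG GCCTCACTGT GCAACTGCTG 120",
    ]
    sequence = clean_listing(listing)
    print(translate(extract_cds(sequence, 43, 120)))
```

Run on these two lines, the translation begins MAPLCPSPWL PLLIPAPAPG LTVQLL, which agrees with the start of Seq ID NO: 2; applying the same steps to the full entry with coordinates 43..1422 would be the natural consistency check.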

50 Seq ID NO: 3 DNA sequence 

Nucleic Acid Accession #: BC013923 
Coding sequence: 43 8-1391 

1 11 21 31 41 51 

55 | | I I I I 

AGCGGGGTTG TCTATTAACT TGTTCAAAAA GTATCAGGAG TTGTCAAGGC AGAGAAGAGA 60 

GTGTTTGCAA AAG GGGG AAA GTAGTTTGCT GCCTCTTTAA GACTAGGACT GAGAGAAAGA 120 

AGAGGAGAGA GAAAGAAAGG GAGAGAAGTT TGAGCCCCAG GCTTAAGCCT TTCCAAAAAA 180 

TAATAATAAC AATC AT CGGC GGCGGCAGGA TCGGCCAGAG GAGGAGGGAA GCGCTTTTTT 240 

60 TGATCCTGAT TCCAGTTTGC CTCTCTCTTT TTTTCCCCCA AATTATTCTT CGCCTGATTT 300 

TCCTCGCGGA GCCCTGCGCT CCCGACACCC CCGCCCGCCT CCCCTCCTCC TCTCCCCCCG 3 60 

CCCGCGGGCC CCCCAAAGTC CCGGCCGGGC CGAGGGT CGG CGGCCGCCGG CGGGCCGGGC 420 

CCGCGCACAG CGCCCGCATG TACAACATGA TGGAGACGGA GCTGAAGCCG CCGGGCCCGC 480 

AGCAAACTTC GGGGGGCGGC GGCGGCAACT CCACCGCGGC GGCGGCCGGC GGCAACCAGA 540 

AAAACAGCCC GGACCGCGTC AAGCGGCCCA TGAATGCCTT CATGGTGTGG TCCCGCGGGC 600

AGCGGCGCAA GATGGCCCAG GAGAACCCCA AGATGCACAA CTCGGAGATC AGCAAGCGCC 660 

TGGGCGCCGA GTGGAAACTT TTGTCGGAGA CGGAGAAGCG GCCGTTCATC GACGAGGCTA 720

AGCGGCTGCG AGCGCTGCAC ATGAAGGAGC ACCCGGATTA TAAATACCGG CCCCGGCGGA 780 

AAACCAAGAC GCTCATGAAG AAGGATAAGT ACACGCTGCC CGGCGGGCTG CTGGCCCCCG 840 

GCGGCAATAG CATGGCGAGC GGGGTCGGGG TGGGCGCCGG CCTGGGCGCG GGCGTGAACC 900

AGCGCATGGA CAGTTACGCG CACATGAACG GCTGGAGCAA CGGCAGCTAC AGCATGATGC 960 

AGGACCAGCT GGGCTACCCG CAGCACCCGG GCCTCAATGC GCACGGCGCA GCGCAGATGC 1020 

AGCCCATGCA CCGCTACGAC GTGAGCGCCC TGCAGTACAA CTCCATGACC AGCTCGCAGA 1080 

CCTACATGAA CGGCTCGCCC ACCTACAGCA TGTCCTACTC GCAGCAGGGC ACCCCTGGCA 1140 

TGGCTCTTGG CTCCATGGGT TCGGTGGTCA AGTCCGAGGC CAGCTCCAGC CCCCCTGTGG 1200

TTACCTCTTC CTCCCACTCC AGGGCGCCCT GCCAGGCCGG GGACCTCCGG GACATGATCA 1260 

GCATGTATCT CCCCGGCGCC GAGGTGCCGG AACCCGCCGC GCCCAGCAGA CTTCACATGT 1320

CCCAGCACTA CCAGAGCGGC CCGGTGCCCG GCACGGCCAT TAACGGCACA CTGCCCCTCT 1380 

CACACATGTG AGGGCCGGAC AGCGAACTGG AGGGGGGAGA AATTTTCAAA GAAAAACGAG 1440 

GGAAATGGGA GGGGTGCAAA AGAGGAGAGT AAGAAACAGC ATGGAGAAAA CCCGGTACGC 1500

TCAAAAAAAA AAAAAAAAAA AAAATCCCAT CACCCACAGC AAATGACAGC TGCAAAAGAG 1560 

AACACCAATC CCATCCACAC TCACGCAAAA ACCGCGATGC CGACAAGAAA ACTTTTATGA 1620

GAGAGATCCT GGACTTCTTT TKGGGGGACT ATTTTTGTAC AGAGAAAACC TGGGGAGGGT 1680 

GGGGAGGGCG GGGGAATGGA CCTTGTATAG ATCTGGAGGA AAGAAAGCTA CGAAAAACTT 1740 

TTTAAAAGTT CTAGTGGTAC GGTAGGAGCT TTGCAGGAAG TTTGCAAAAG TCTTTACCAA 1800

TAATATTTAG AGCTAGTCTC CAAGCGACGA AAAAAATGTT TTAATATTTG CAAGCAACTT 1860 

TTGTACAGTA TTTATCGAGA TAAACATGGC AATCAAAATG TCCATTGTTT ATAAGCTGAG 1920 



AATTTGCCAA TATTTTTCAA GGAGAGGCTT CTTGCTGAAT TTTGATTCTG CAGCTGAAAT 1980

TTAGGACAGT TGCAAACGTG AAAAGAAGAA AATTATTCAA ATTTGGACAT TTTAATTGTT 2040 

TAAAAATTGT ACAAAAGGAA AAAATTAGAA TAAGTACTGG CGAACCATCT CTGTGGTCTT 2100 

GTTTAAAAAG GGCAAAAGTT TTAGACTGTA CTAAATTTTA TAACTTACTG TTAAAAGCAA 2160 

AAATGGCCAT GCAGGTTGAC ACCGTTGGTA ATTTATAATA GCTTTTGTTC GATCCCAACT 2220 

TTCCATTTTG TTCAGATAAA AAAAACCATG AAATTACTGT GTTTGAAATA TTTTCTTATG 2280

GTTTGTAATA TTTCTGTAAA TTTATTGTGA TATTTTAAGG TTTTCCCCCC TTTATTTTCC 2340

GTAGTTGTAT TTTAAAAGAT TCGGCTCTGT ATTATTTGAA TCAGTCTGCC GAGAATCCAT 2400 

GTATATATTT GAACTAATAT CATCCTTATA ACAGGTACAT TTTCAACTTA AGTTTTTACT 2460

CCATTATGCA CAGTTTGAGA TAAATAAATT TTTGAAATAT GGACACTGAA AAAAAAAAAA 2520 

AAAAAAACAA AACAAAAAAA CAAAAAACAA AAACAGAAAA AACAAAAAAA AAAACAAAAC 2580 

CACAACACAA AAACAAAAAA AAAAAAAAGA AACAAACACA CAACACAACA CAACACAAAA 2640
CCACAACACA AACAACAACA CACAGAGGG 

Seq ID NO: 4 Protein sequence:
Protein Accession #: CAA83435.1

1 11 21 31 41 51

| | | | | |

MYNMMETELK PPGPQQTSGG GGGNSTAAAA GGNQKNSPDR VKRPMNAFMV WSRGQRRKMA 60

QENPKMHNSE ISKRLGAEWK LLSETEKRPF IDEAKRLRAL HMKEHPDYKY RPRRKTKTLM 120

KKDKYTLPGG LLAPGGNSMA SGVGVGAGLG AGVNQRMDSY AHMNGWSNGS YSMMQDQLGY 180

PQHPGLNAHG AAQMQPMHRY DVSALQYNSM TSSQTYMNGS PTYSMSYSQQ GTPGMALGSM 240

GSVVKSEASS SPPVVTSSSH SRAPCQAGDL RDMISMYLPG AEVPEPAAPS RLHMSQHYQS 300
GPVPGTAING TLPLSHM

Seq ID NO: 5 DNA sequence
Nucleic Acid Accession #: U91618
Coding sequence: 29-541

1 11 21 31 41 51

| | | | | |

CGGACTTGGC TTGTTAGAAG GCTGAAAGAT GATGGCAGGA ATGAAAATCC AGCTTGTATG 60

CATGCTACTC CTGGCTTTCA GCTCCTGGAG TCTGTGCTCA GATTCAGAAG AGGAAATGAA 120

AGCATTAGAA GCAGATTTCT TGACCAATAT GCATACATCA AAGATTAGTA AAGCACATGT 180

TCCCTCTTGG AAGATGACTC TGCTAAATGT TTGCAGTCTT GTAAATAATT TGAACAGCCC 240

AGCTGAGGAA ACAGGAGAAG TTCATGAAGA GGAGCTTGTT GCAAGAAGGA AACTTCCTAC 300

TGCTTTAGAT GGCTTTAGCT TGGAAGCAAT GTTGACAATA TACCAGCTCC ACAAAATCTG 360

TCACAGCAGG GCTTTTCAAC ACTGGGAGTT AATCCAGGAA GATATTCTTG ATACTGGAAA 420

TGACAAAAAT GGAAAGGAAG AAGTCATAAA GAGAAAAATT CCTTATATTC TGAAACGGCA 480

GCTGTATGAG AATAAACCCA GAAGACCCTA CATACTCAAA AGAGATTCTT ACTATTACTG 540

AGAGAATAAA TCATTTATTT ACATGTGATT GTGATTCATC ATCCCTTAAT TAAATATCAA 600

ATTATATTTG TGTGAAAATG TGACAAACAC ACTTATCTGT CTCTTCTACA ATTGTGGTTT 660

ATTGAATGTG TTTTTCTGCA CTAATAGAAA TTAGACTAAG TGTTTTCAAA TAAATCTAAA 720
TCTTCAAAAA AAAAAAAAAA AAATGGGGCC GCAATT

Seq ID NO: 6 Protein sequence: 
Protein Accession #: AAB50564

1 11 21 31 41 51 

| | | | | |

MMAGMKIQLV CMLLLAFSSW SLCSDSEEEM KALEADFLTN MHTSKISKAH VPSWKMTLLN 60

VCSLVNNLNS PAEETGEVHE EELVARRKLP TALDGFSLEA MLTIYQLHKI CHSRAFQHWE 120
LIQEDILDTG NDKNGKEEVI KRKIPYILKR QLYENKPRRP YILKRDSYYY 

Seq ID NO: 7 DNA sequence 
Nucleic Acid Accession #: NM_006536.2 
Coding sequence: 109-2940

1 11 21 31 41 51 

| | | | | |

ACCTAAAACC TTGCAAGTTC AGGAAGAAAC CATCTGCATC CATATTGAAA ACCTGACACA 60 

ATGTATGCAG CAGGCTCAGT GTGAGTGAAC TGGAGGCTTC TCTACAACAT GACCCAAAGG 120

AGCATTGCAG GTCCTATTTG CAACCTGAAG TTTGTGACTC TCCTGGTTGC CTTAAGTTCA 180 

GAACTCCCAT TCCTGGGAGC TGGAGTACAG CTTCAAGACA ATGGGTATAA TGGATTGCTC 240 

ATTGCAATTA ATCCTCAGGT ACCTGAGAAT CAGAACCTCA TCTCAAACAT TAAGGAAATG 300 

ATAACTGAAG CTTCATTTTA CCTATTTAAT GCTACCAAGA GAAGAGTATT TTTCAGAAAT 360 

ATAAAGATTT TAATACCTGC CACATGGAAA GCTAATAATA ACAGCAAAAT AAAACAAGAA 420

TCATATGAAA AGGCAAATGT CATAGTGACT GACTGGTATG GGGCACATGG AGATGATCCA 480

TACACCCTAC AATACAGAGG GTGTGGAAAA GAGGGAAAAT ACATTCATTT CACACCTAAT 540 

TTCCTACTGA ATGATAACTT AACAGCTGGC TACGGATCAC GAGGCCGAGT GTTTGTCCAT 600 

GAATGGGCCC ACCTCCGTTG GGGTGTGTTC GATGAGTATA ACAATGACAA ACCTTTCTAC 660 

ATAAATGGGC AAAATCAAAT TAAAGTGACA AGGTGTTCAT CTGACATCAC AGGCATTTTT 720

GTGTGTGAAA AAGGTCCTTG CCCCCAAGAA AACTGTATTA TTAGTAAGCT TTTTAAAGAA 780 

GGATGCACCT TTATCTACAA TAGCACCCAA AATGCAACTG CATCAATAAT GTTCATGCAA 840

AGTTTATCTT CTGTGGTTGA ATTTTGTAAT GCAAGTACCC ACAACCAAGA AGCACCAAAC 900 

CTACAGAACC AGATGTGCAG CCTCAGAAGT GCATGGGATG TAATCACAGA CTCTGCTGAC 960

TTTCACCACA GCTTTCCCAT GAATGGGACT GAGCTTCCAC CTCCTCCCAC ATTCTCGCTT 1020

GTACAGGCTG GTGACAAAGT GGTCTGTTTA GTGCTGGATG TGTCCAGCAA GATGGCAGAG 1080 

GCTGACAGAC TCCTTCAACT ACAACAAGCC GCAGAATTTT ATTTGATGCA GATTGTTGAA 1140 

ATTCATACCT TCGTGGGCAT TGCCAGTTTC GACAGCAAAG GAGAGATCAG AGCCCAGCTA 1200 

CACCAAATTA ACAGCAATGA TGATCGAAAG TTGCTGGTTT CATATCTGCC CACCACTGTA 1260 

TCAGCTAAAA CAGACATCAG CATTTGTTCA GGGCTTAAGA AAGGATTTGA GGTGGTTGAA 1320

AAACTGAATG GAAAAGCTTA TGGCTCTGTG ATGATATTAG TGACCAGCGG AGATGATAAG 1380 

CTTCTTGGCA ATTGCTTACC CACTGTGCTC AGCAGTGGTT CAACAATTCA CTCCATTGCC 1440 
CTGGGTTCAT CTGCAGCCCC AAATCTGGAG GAATTATCAC GTCTTACAGG AGGTTTAAAG 1500

TTCTTTGTTC CAGATATATC AAACTCCAAT AGCATGATTG ATGCTTTCAG TAGAATTTCC 1560 

TCTGGAACTG GAGACATTTT CCAGCAACAT ATTCAGCTTG AAAGTACAGG TGAAAATGTC 1620 

AAACCTCACC ATCAATTGAA AAACACAGTG ACTGTGGATA ATACTGTGGG CAACGACACT 1680

ATGTTTCTAG TTACGTGGCA GGCCAGTGGT CCTCCTGAGA TTATATTATT TGATCCTGAT 1740

GGACGAAAAT ACTACACAAA TAATTTTATC ACCAATCTAA CTTTTCGGAC AGCTAGTCTT 1800 

TGGATTCCAG GAACAGCTAA GCCTGGGCAC TGGACTTACA CCCTGAACAA TACCCATCAT 1860 

TCTCTGCAAG CCCTGAAAGT GACAGTGACC TCTCGCGCCT CCAACTCAGC TGTGCCCCCA 1920 

GCCACTGTGG AAGCCTTTGT GGAAAGAGAC AGCCTCCATT TTCCTCATCC TGTGATGATT 1980 

TATGCCAATG TGAAACAGGG ATTTTATCCC ATTCTTAATG CCACTGTCAC TGCCACAGTT 2040

GAGCCAGAGA CTGGAGATCC TGTTACGCTG AGACTCCTTG ATGATGGAGC AGGTGCTGAT 2100 

GTTATAAAAA ATGATGGAAT TTACTCGAGG TATTTTTTCT CCTTTGCTGC AAATGGTAGA 2160 

TATAGCTTGA AAGTGCATGT CAATCACTCT CCCAGCATAA GCACCCCAAC CCACTCTATT 2220 

CCAGGGAGTC ATGCTATGTA TGTACCAGGT TACACAGCAA ACGGTAATAT TCAGATGAAT 2280

GCTCCAAGGA AATCAGTAGG CAGAAATGAG GAGGAGCGAA AGTGGGGCTT TAGCCGAGTC 2340

AGCTCAGGAG GCTCCTTTTC AGTGCTGGGA GTTCCAGCTG GCCCCCACCC TGATGTGTTT 2400 

CCACCATGCA AAATTATTGA CCTGGAAGCT GTAAAAGTAG AAGAGGAATT GACCCTATCT 2460

TGGACAGCAC CTGGAGAAGA CTTTGATCAG GGCCAGGCTA CAAGCTATGA AATAAGAATG 2520

AGTAAAAGTC TACAGAATAT CCAAGATGAC TTTAACAATG CTATTTTAGT AAATACATCA 2580

AAGCGAAATC CTCAGCAAGC TGGCATCAGG GAGATATTTA CGTTCTCACC CCAGATTTCC 2640

ACGAATGGAC CTGAACATCA GCCAAATGGA GAAACACATG AAAGCCACAG AATTTATGTT 2700

GCAATACGAG CAATGGATAG GAACTCCTTA CAGTCTGCTG TATCTAACAT TGCCCAGGCG 2760 

CCTGTGTTTA TTCCCCCCAA TTCTGATCCT GTACCTGCCA GAGATTATCT TATATTGAAA 2820

GGAGTTTTAA CAGCAATGGG TTTGATAGGA ATCATTTGCC TTATTATAGT TGTGACACAT 2880

CATACTTTAA GCAGGAAAAA GAGAGCAGAC AAGAAAGAGA ATGGAACAAA ATTATTATAA 2940

ATAAATATCC AAAGTGTCTT CCTTCTTAGA TATAAGACCC ATGGCCTTCG ACTACAAAAA 3000

CATACTAACA AAGTCAAATT AACATCAAAA CTGTATTAAA ATGCATTGAG TTTTTGTACA 3060

ATACAGATAA GATTTTTACA TGGTAGATCA ACAATTCTTT TTGGGGGTAG ATTAGAAAAC 3120 

CCTTACACTT TGGCTATGAA CAAATAATAA AAATTATTCT TTAAAGTAAT GTCTTTAAAG 3180 

GCAAAGGGAA GGGTAAAGTC GGACCAGTGT CAAGGAAAGT TTGTTTTATT GAGGTGGAAA 3240

AATAGCCCCA AGCAGAGAAA AGGAGGGTAG GTCTGCATTA TAACTGTCTG TGTGAAGCAA 3300 

TCATTTAGTT ACTTTGATTA ATTTTTCTTT TCTCCTTATC TGTGCAGTAC AGGTTGCTTG 3360 

TTTACATGAA GATCATGCTA TATTTTATAT ATGTAGCCCC TAATGCAAAG CTCTTTACCT 3420

CTTGCTATTT TGTTATATAT ATTTCAGATG ACATCTCCCT GCTAATGCTC AGAGATCTTT 3480 

TTTCACTGTA AGAGGTAACC TTTAACAATA TGGGTATTAC CTTTGTCTCT TCATACCGGT 3540

TTTATGACAA AGGTCTATTG AATTTATTTG TNTGTAAGTT TCTACTCCCA TCAAAGCAGC 3600 

TTTCTAAGTT TATTGCCTTG GGTTATTATG GAATGATAGT TATAGCCCCN TATAATGCCT 3660 

TACCTAGGAA A 

Seq ID NO: 8 Protein sequence:

Protein Accession #: NP_006527.1

1 11 21 31 41 51 

| | | | | |

MTQRSIAGPI CNLKFVTLLV ALSSELPFLG AGVQLQDNGY NGLLIAINPQ VPENQNLISN 60

IKEMITEASF YLFNATKRRV FFRNIKILIP ATWKANNNSK IKQESYEKAN VIVTDWYGAH 120 

GDDPYTLQYR GCGKEGKYIH FTPNFLLNDN LTAGYGSRGR VFVHEWAHLR WGVFDEYNND 180 

KPFYINGQNQ IKVTRCSSDI TGIFVCEKGP CPQENCIISK LFKEGCTFIY NSTQNATASI 240

MFMQSLSSVV EFCNASTHNQ EAPNLQNQMC SLRSAWDVIT DSADFHHSFP MNGTELPPPP 300

TFSLVQAGDK VVCLVLDVSS KMAEADRLLQ LQQAAEFYLM QIVEIHTFVG IASFDSKGEI 360

RAQLHQINSN DDRKLLVSYL PTTVSAKTDI SICSGLKKGF EVVEKLNGKA YGSVMILVTS 420

GDDKLLGNCL PTVLSSGSTI HSIALGSSAA PNLEELSRLT GGLKFFVPDI SNSNSMIDAF 480

SRISSGTGDI FQQHIQLEST GENVKPHHQL KNTVTVDNTV GNDTMFLVTW QASGPPEIIL 540

FDPDGRKYYT NNFITNLTFR TASLWIPGTA KPGHWTYTLN NTHHSLQALK VTVTSRASNS 600

AVPPATVEAF VERDSLHFPH PVMIYANVKQ GFYPILNATV TATVEPETGD PVTLRLLDDG 660

AGADVIKNDG IYSRYFFSFA ANGRYSLKVH VNHSPSISTP AHSIPGSHAM YVPGYTANGN 720

IQMNAPRKSV GRNEEERKWG FSRVSSGGSF SVLGVPAGPH PDVFPPCKII DLEAVKVEEE 780 

LTLSWTAPGE DFDQGQATSY EIRMSKSLQN IQDDFNNAIL VNTSKRNPQQ AGIREIFTFS 840 

PQISTNGPEH QPNGETHESH RIYVAIRAMD RNSLQSAVSN IAQAPLFIPP NSDPVPARDY 900 
LILKGVLTAM GLIGIICLII VVTHHTLSRK KRADKKENGT KLL



Seq ID NO: 9 DNA sequence 
Nucleic Acid Accession #: Eos sequence 
Coding sequence: 336-632

1 11 21 31 41 51 

| | | | | |

CTCCCCTCAC CCCGGTCCAG GATGCCCAGT CCCCACGACA CCTCCCACTT CCCACTGTGG 60 

CCTGGGTGGG CTCAGGGGCT GCCCTTGACC TGGCCTAGAG CCCTCCCCCA GCTGGTGGTG 120

GAGCTGGCAC TCTCTGGGAG GGAGGGGGCT GGGAGGGAAT GAGTGGGAAT GGCAAGAGGC 180 

CAGGGTTTGG TGGGATCAGG TTGAGGCAGG TTTGGTTTCC TTAAAATGCC AAGTTGGGGG 240 

CCAGTGGGGC CCACATATAA ATCCTCACCC TGGGAGCCTG GCTGCCTTGC TCTCCTTCCT 300 

GGGTCTGTCT CTGCCACCTG GTCTGCCACA GATCCATGAT GTGCAGTTCT CTGGAGCAGG 360 

CGCTGGCTGT GCTGGTCACT ACCTTCCACA AGTACTCCTG CCAAGAGGGC GACAAGTTCA 420

AGCTGAGTAA GGGGGAAATG AAGGAACTTC TGCACAAGGA GCTGCCCAGC TTTGTGGGGG 480

AGAAAGTGGA TGAGGAGGGG CTGAAGAAGC TGATGGGCAG CCTGGATGAG AACAGTGACC 540 

AGCAGGTGGA CTTCCAGGAG TATGCTGTTT TCCTGGCACT CATCACTGTC ATGTGCAATG 600

ACTTCTTCCA GGGCTGCCCA GACCGACCCT GAAGCAGAAC TCTTGACTTC CTGCCATGGA 660 

TCTCTTGGGC CCAGGACTGT TGATGCCTTT GAGTTTTGTA TTCAATAAAC TTTTTTTGTC 720

TGTTGATAAT ATTTTAATTG CTCAGTGATG TTCCATAACC CGGCTGGCTC AGCTGGAGTG 780 

CTGGGAGATG AGGGCCTCCT GGATCCTGCT CCCTTCTGGG CTCTGACTCT CCTGGAAATC 840 

TCTCCAAGGC CAGAGCTATG CTTTAGGTCT CAATTTTGGA ATTTCAAACA CCAGCAAAAA 900 

ATTGGAAATC GAGATAGGTT GCTGACTTTT ATTTTGTCAA ATAAAGATAT TAAAAAAGGC 960 

AAATACCA

Seq ID NO: 10 Protein sequence: 
Protein Accession #: NP_005969.1






1 11 21 31 41 51 

| | | | | |

MMCSSLEQAL AVLVTTFHKY SCQEGDKFKL SKGEMKELLH KELPSFVGEK VDEEGLKKLM 60
GSLDENSDQQ VDFQEYAVFL ALITVMCNDF FQGCPDRP 

Seq ID NO: 11 DNA sequence 
Nucleic Acid Accession #: Eos sequence
Coding sequence: 336-626 

1 11 21 31 41 51 

| | | | | |

CTCCCCTCAC CCCGGTCCAG GATGCCCAGT CCCCACGACA CCTCCCACTT CCCACTGTGG 60

CCTGGGTGGG CTCAGGGGCT GCCCTTGACC TGGCCTAGAG CCCTCCCCCA GCTGGTGGTG 120

GAGCTGGCAC TCTCTGGGAG GGAGGGGGCT GGGAGGGAAT GAGTGGGAAT GGCAAGAGGC 180 

CAGGGTTTGG TGGGATCAGG TTGAGGCAGG TTTGGTTTCC TTAAAATGCC AAGTTGGGGG 240 

CCAGTGGGGC CCACATATAA ATCCTCACCC TGGGAGCCTG GCTGCCTTGC TCTCCTTCCT 300

GGGTCTGTCT CTGCCACCTG GTCTGCCACA GATCCATGAT GTGCAGTTCT CTGGAGCAGG 360

CGCTGGCTGT GCTGGTCACT ACCTTCCACA AGTACTCCTG CCAAGAGGGC GACAAGTTCA 420

AGCTGAGTAA GGGGGAAATG AAGGAACTTC TGCACAAGGA GCTGCCCAGC TTTGTGGGGC 480 

ATTCCAGAGA ACCATGTGCT GTGAGGGCCT TCCGAGTCCA TCTGTTTAAT CCTGTCATTG 540

GAGACTTGAG AAACCAGAGC CCAGAAGGGA AAAGTGATTG TCCCAAGATC ACACAGCACT 600 

GGAGAAAGTG GATGAGGAGG GGCTGAAGAA GCTGATGGGC AGCCTGGATG AGAACAGTGA 660

CCAGCAGGTG GACTTCCAGG AGTATGCTGT TTTCCTGGCA CTCATCACTG TCATGTGCAA 720

TGACTTCTTC CAGGGCTGCC CAGACCGACC CTGAAGCAGA ACTCTTGACT TCCTGCCATG 780 

GATCTCTTGG GCCCAGGACT GTTGATGCCT TTGAGTTTTG TATTCAATAA ACTTTTTTTG 840 

TCTGTTGATA ATATTTTAAT TGCTCAGTGA TGTTCCATAA CCCGGCTGGC TCAGCTGGAG 900 

TGCTGGGAGA TGAGGGCCTC CTGGATCCTG CTCCCTTCTG GGCTCTGACT CTCCTGGAAA 960

TCTCTCCAAG GCCAGAGCTA TGCTTTAGGT CTCAATTTTG GAATTTCAAA CACCAGCAAA 1020 

AAATTGGAAA TCGAGATAGG TTGCTGACTT TTATTTTGTC AAATAAAGAT ATTAAAAAAG 1080 
GCAAATACCA 

Seq ID NO: 12 Protein sequence:

Protein Accession #: Eos sequence

1 11 21 31 41 51 

| | | | | |

MMCSSLEQAL AVLVTTFHKY SCQEGDKFKL SKGEMKELLH KELPSFVGHS REPCAVRAFR 60
VHLFNPVIGD LRNQSPEGKS DCPKITQHWR KWMRRG 



Seq ID NO: 13 DNA sequence 
Nucleic Acid Accession #: Eos sequence
Coding sequence: 58-354 

1 11 21 31 41 51 

| | | | | |

GTGAGCTCAC CATGTGGGGG TGAGGCTGAG AGAAAACAAG TACACAGCCA CAGATCCATG 60

ATGTGCAGTT CTCTGGAGCA GGCGCTGGCT GTGCTGGTCA CTACCTTCCA CAAGTACTCC 120 
TGCCAAGAGG GCGACAAGTT CAAGCTGAGT AAGGGGGAAA TGAAGGAACT TCTGCACAAG 180 
GAGCTGCCCA GCTTTGTGGG GGAGAAAGTG GATGAGGAGG GGCTGAAGAA GCTGATGGGC 240 
AGCCTGGATG AGAACAGTGA CCAGCAGGTG GACTTCCAGG AGTATGCTGT TTTCCTGGCA 300 

CTCATCACTG TCATGTGCAA TGACTTCTTC CAGGGCTGCC CAGACCGACC CTGAAGCAGA 360

ACTCTTGACT TCCTGCCATG GATCTCTTGG GCCCAGGACT GTTGATGCCT TTGAGTTTTG 420 
TATTCAATAA ACTTTTTTTG TCTGTTGATA ATATTTTAAT TGCTCAGTGA TGTTCCATAA 480 
CCCGGCTGGC TCAGCTGGAG TGCTGGGAGA TGAGGGCCTC CTGGATCCTG CTCCCTTCTG 540
GGCTCTGACT CTCCTGGAAA TCTCTCCAAG GCCAGAGCTA TGCTTTAGGT CTCAATTTTG 600 

GAATTTCAAA CACCAGCAAA AAATTGGAAA TCGAGATAGG TTGCTGACTT TTATTTTGTC 660

AAATAAAGAT ATTAAAAAAG GCAAATACCA 

Seq ID NO: 14 Protein sequence: 
Protein Accession #: NP_005969.1 
1 11 21 31 41 51

| | | | | |

MMCSSLEQAL AVLVTTFHKY SCQEGDKFKL SKGEMKELLH KELPSFVGEK VDEEGLKKLM 60 
GSLDENSDQQ VDFQEYAVFL ALITVMCNDF FQGCPDRP 


Seq ID NO: 15 DNA sequence 

Nucleic Acid Accession #: Eos sequence 

Coding sequence: 62-358

1 11 21 31 41 51

| | | | | |

GGAGGGTGTG CCGCTGAGTC ACTGCCTGGG CATCTGGGCC TGGAACCTCG GCCACAGATC 60 

CATGATGTGC AGTTCTCTGG AGCAGGCGCT GGCTGTGCTG GTCACTACCT TCCACAAGTA 120

CTCCTGCCAA GAGGGCGACA AGTTCAAGCT GAGTAAGGGG GAAATGAAGG AACTTCTGCA 180 

CAAGGAGCTG CCCAGCTTTG TGGGGGAGAA AGTGGATGAG GAGGGGCTGA AGAAGCTGAT 240

GGGCAGCCTG GATGAGAACA GTGACCAGCA GGTGGACTTC CAGGAGTATG CTGTTTTCCT 300 

GGCACTCATC ACTGTCATGT GCAATGACTT CTTCCAGGGC TGCCCAGACC GACCCTGAAG 360 

CAGAACTCTT GACTTCCTGC CATGGATCTC TTGGGCCCAG GACTGTTGAT GCCTTTGAGT 420 

TTTGTATTCA ATAAACTTTT TTTGTCTGTT GATAATATTT TAATTGCTCA GTGATGTTCC 480 

ATAACCCGGC TGGCTCAGCT GGAGTGCTGG GAGATGAGGG CCTCCTGGAT CCTGCTCCCT 540

TCTGGGCTCT GACTCTCCTG GAAATCTCTC CAAGGCCAGA GCTATGCTTT AGGTCTCAAT 600 

TTTGGAATTT CAAACACCAG CAAAAAATTG GAAATCGAGA TAGGTTGCTG ACTTTTATTT 660 
TGTCAAATAA AGATATTAAA AAAGGCAAAT ACCA

Seq ID NO: 16 Protein sequence: 
Protein Accession #: NP_005969.1

1 11 21 31 41 51 

| | | | | |

MMCSSLEQAL AVLVTTFHKY SCQEGDKFKL SKGEMKELLH KELPSFVGEK VDEEGLKKLM 60 
GSLDENSDQQ VDFQEYAVFL ALITVMCNDF FQGCPDRP 



Seq ID NO: 17 DNA sequence 

Nucleic Acid Accession #: Eos sequence 

Coding sequence: 939-2372

1 11 21 31 41 51 

| | | | | |

AAGACGGATT CTCAGACAAG GCTTGCAAAT GCCCCGCAGC CATCATTTAA CTGCACCCGC 60
AGAATAGTTA CGGTTTGTCA CCCGACCCTC CCGGATCGCC TAATTTGTCC CTAGTGAGAC 120 
CCCGAGGCTC TGCCCGCGCC TGGCTTCTTC GTAGCTGGAT GCATATCGTG CTCCGGGCAG 180 
CGCGGGCGCA GGGCACGCGT TCGCGCACAC CCTAGCACAC ATGAACACGC GCAAGAGCTG 240 
AACCAAGCAC GGTTTCCATT TCAAAAAGGG AGACAGCCTC TACCGCGATT GTAGAAGAGA 300 
CTGTGGTGTG AATTAGGGAC CGGGAGGCGT CGAACGGAGG AACGGTTCAT CTTAGAGACT 360 
AATTTTCTGG AGTTTCTGCC CCTGCTCTGC GTCAGCCCTC ACGTCACTTC GCCAGCAGTA 420 
GCAGAGGCGG CGGCGGCGGC TCCCGGAATT GGGTTGGAGC AGGAGCCTCG CTGGCTGCTT 480 
CGCTCGCGCT CTACGCGCTC AGTCCCCGGC GGTAGCAGGA GCCTGGACCC AGGCGCCGCC 540 
GGCGGGCGTG AGGCGCCGGA GCCCGGCCTC GAGGTGCATA CCGGACCCCC ATTCGCATCT 600 
AACAAGGAAT CTGCGCCCCA GAGAGTCCCG GGAGCGCCGC CGGTCGGTGC CCGGCGCGCC 660 
GGGCCATGCA GCGACGGCCG CCGCGGAGCT CCGAGCAGCG GTAGCGCCCC CCTGTAAAGC 720 
GGTTCGCTAT GCCGGGGCCA CTGTGAACCC TGCCGCCTGC CGGAACACTC TTCGCTCCGG 780
ACCAGCTCAG CCTCTGATAA GCTGGACTCG GCACGCCCGC AACAAGCACC GAGGAGTTAA 840 
GAGAGCCGCA AGCGCAGGGA AGGCCTCCCC GCACGGGTGG GGGAAAGCGG CCGGTGCAGC 900 
GCGGGGACAG GCACTCGGGC TGGCACTGGC TGCTAGGGAT GTCGTCCTGG ATAAGGTGGC 960 

ATGGACCCGC CATGGCGCGG CTCTGGGGCT TCTGCTGGCT GGTTGTGGGC TTCTGGAGGG 1020

CCGCTTTCGC CTGTCCCACG TCCTGCAAAT GCAGTGCCTC TCGGATCTGG TGCAGCGACC 1080

CTTCTCCTGG CATCGTGGCA TTTCCGAGAT TGGAGCCTAA CAGTGTAGAT CCTGAGAACA 1140 

TCACCGAAAT TTTCATCGCA AACCAGAAAA GGTTAGAAAT CATCAACGAA GATGATGTTG 1200 

AAGCTTATGT GGGACTGAGA AATCTGACAA TTGTGGATTC TGGATTAAAA TTTGTGGCTC 1260 

ATAAAGCATT TCTGAAAAAC AGCAACCTGC AGCACATCAA TTTTACCCGA AACAAACTGA 1320

CGAGTTTGTC TAGGAAACAT TTCCGTCACC TTGACTTGTC TGAACTGATC CTGGTGGGCA 1380 

ATCCATTTAC ATGCTCCTGT GACATTATGT GGATCAAGAC TCTCCAAGAG GCTAAATCCA 1440 

GTCCAGACAC TCAGGATTTG TACTGCCTGA ATGAAAGCAG CAAGAATATT CCCCTGGCAA 1500 

ACCTGCAGAT ACCCAATTGT GGTTTGCCAT CTGCAAATCT GGCCGCACCT AACCTCACTG 1560 

TGGAGGAAGG AAAGTCTATC ACATTATCCT GTAGTGTGGC AGGTGATCCG GTTCCTAATA 1620

TGTATTGGGA TGTTGGTAAC CTGGTTTCCA AACATATGAA TGAAACAAGC CACACACAGG 1680

GCTCCTTAAG GATAACTAAC ATTTCATCCG ATGACAGTGG GAAGCAGATC TCTTGTGTGG 1740 

CGGAAAATCT TGTAGGAGAA GATCAAGATT CTGTCAACCT CACTGTGCAT TTTGCACCAA 1800 

CTATCACATT TCTCGAATCT CCAACCTCAG ACCACCACTG GTGCATTCCA TTCACTGTGA 1860

AAGGCAACCC CAAACCAGCG CTTCAGTGGT TCTATAACGG GGCAATATTG AATGAGTCCA 1920

AATACATCTG TACTAAAATA CATGTTACCA ATCACACGGA GTACCACGGC TGCCTCCAGC 1980 

TGGATAATCC CACTCACATG AACAATGGGG ACTACACTCT AATAGCCAAG AATGAGTATG 2040 

GGAAGGATGA GAAACAGATT TCTGCTCACT TCATGGGCTG GCCTGGAATT GACGATGGTG 2100 

CAAACCCAAA TTATCCTGAT GTAATTTATG AAGATTATGG AACTGCAGCG AATGACATCG 2160 

GGGACACCAC GAACAGAAGT AATGAAATCC CTTCCACAGA CGTCACTGAT AAAACCGGTC 2220

GGGAACATCT CTCGGTCTAT GCTGTGGTGG TGATTGCGTC TGTGGTGGGA TTTTGCCTTT 2280

TGGTAATGCT GTTTCTGCTT AAGTTGGCAA GACACTCCAA GTTTGGCATG AAAGGTTTTG 2340 

TTTTGTTTCA TAAGATCCCA CTGGATGGGT AGCTGAAATA AAGGAAAAGA CAGAGAAAGG 2400 

GGCTGTGGTG CTTGTTGGTT GATGCTGCCA TGTAAGCTGG ACTCCTGGGA CTGCTGTTGG 2460 

CTTATCCCGG GAAGTGCTGC TTATCTGGGG TTTTCTGGTA GATGTGGGCG GTGTTTGGAG 2520 

GCTGTACTAT ATGAAGCCTG CATATACTGT GAGCTGTGAT TGGGGAACAC CAATGCAGAG 2580 

GTAACTCTCA GGCAGCTAAG CAGCACCTCA AGAAAACATG TTAAATTAAT GCTTCTCTTC 2640 

TTACAGTAGT TCAAATACAA AACTGAAATG AAATCCCATT GGATTGTACT TCTCTTCTGA 2700 

AAAGTGTGCT TTTTGACCCT ACTGGACATT TATTGACTTA ATTGCTTCTG TTTATTAAAA 2760 

TTGACCTGCA AAGTTAAAAA AAAATTAAAG TTGAGAACAG GTATAAGTGC ACACTGAATA 2820 

GTCTAATCTA CATGTAACAC ATATTTTAGT GTGATTTTCT ATACTCTAAT CAGCACTGAA 2880 

TTCAGAGGGT TTGACTTTTT CATCTATAAC ACAGTGACTA AAAGAGTTAA GGGTATATAT 2940 

ACCATCACTT TGGGACTTGG TAGTATTATT AAAAGGTTAT TTCCTTCACT GTCAATAAAA 3000

GTCCAAATGT TTAGCTTAGG TCTGAGAGTC AAACAATGTT AAGGATTGTC TTAAAGTTCC 3060 

TTAGCCAGCA AAACAAAACA AAACAAAACA AACAAATGAA AAACGTTTAA AAAGAAGAAG 3120 

AAGAAAAAAA ACAAGAACAA GCAGCAACAG CTGTTTTGTT GGGGCTATAG ATTTAAGTTA 3180 

GGCATAGTCA ATTTCAGAAT AACTAAGAGT GGAATATATG CATATGGTGA AATTATAACC 3240 

TTGCCCTTTT TTATTTGCCC TCTGCGATCC ACCTGCTTTT TAGAAGTCTG CCGAGTGAGA 3300 

AGGCCACAGT ATCTCATGCT GTTTGCATTA CAGAACTGCA GCTTTTCTAC TCTGAAAAGG 3360

CCTGGGAGCA GAATGGCTGG CCTGCTGTGA GCAGGAGAGG AGATTCTAAG AAGGATAGTC 3420 

CCCCCTACAA CATACTGTCA TACTGCTGGG TTTTCATGGG TAGGAAAGCT TGTCCTGACC 3480 

CCAGCAGCAA AGAGGTGGCA GGTCGCTAAT GAATATATGC TTTATAATGT CCTTCTTCAT 3540 

TGCTGAGAGG GCAGCCTTAG AGCTGTGGAT TTCTGCATCC CCCCTGAGTC TGACCCATGG 3600 

ACACCTGTTT CATTCACTTT AGCATCACAG TGACCTTTGT ATGCTCTGTT CAGTCTGTGT 3660

CAGGCAGTAT GCTTGTCCTG AAGAGAGGTT TGGCTATCCC CACCCCACCC CACCCCACCC 3720 

TGTTCCTTTT TTATCAGGAG GACTTCAGAG CCAGGCCTGC AGCATTTTGT TTGAAAACAC 3780

AATCAGCTCT GACAGTTAGA CATGCACACA GACGCCATAG CTGGATTGGA AACATTGATG 3840

TTTTAAAAAT TTATTTTTTT TGGAAATAGT TGCACAAATG CTGCAATTTA GCTTTAAGGT 3900

TCTATAGATT TTTAACTAGT CCAACACAGT CAGAAACATT GTTTTGAATC CTCTGTAAAC 3960

CAAGGCATTA ATCTTAATAA ACCAGGATCC ATTTAGGTAC CACTTGATAT AAAAAGGATA 4020

TCCATAATGA ATATTTTATA CTGCATCCTT TACATTAGCC ACTAAATACG TTATTGCTTG 4080 

ATGAAGACCT TTCACAGAAT CCTATGGATT GCAGCATTTC ACTTGGCTAC TTCATACCCA 4140
TGCCTTAAAG AGGGGCAGTT TCTCAAAAGC AGAAACATGC CGCCAGTTCT CAAGTTTTCC 4200

TCCTAACTCC ATTTGAATGT AAGGGCAGCT GGCCCCCAAT GTGGGGAGGT CCGAACATTT 4260 

TCTGAATTCC CATTTTCTTG TTCGCGGCTA AATGACAGTT TCTGTCATTA CTTAGATTCC 4320 

GATCTTTCCC AAAGGTGTTG ATTTACAAAG AGGCCAGCTA ATAGCAGAAA TCATGACCCT 4380

GAAAGAGAGA TGAAATTCAA GCTGTGAGCC AGGCAGGAGC TCAGTATGGC AAAGGTTCTT 4440 

GAGAATCAGC CATTTGGTAC AAAAAAGATT TTTAAAGCTT TTATGTTATA CCATGGAGCC 4500

ATAGAAAGGC TATGGATTGT TTAAGAACTA TTTTAAAGTG TTCCAGACCC AAAAAGGAAA 4560 

AATAAAAAAA AAGGAATATT TGTACCCAAC AGCTAGAAGG ATTGCAAGGT AGATTTTTGT 4620 

TTTAAAATGG AGAGAAGTGG ACAGATAAGG CCATTTAATA TATCAAAGAT CAGTTGACAT 4680
CTCCTAGGGA ATGATGAAAA CAGCAGGCTA T 

Seq ID NO: 18 Protein sequence: 
Protein Accession #: CAA53571 

1 11 21 31 41 51 

| | | | | |

MSSWIRWHGP AMARLWGFCW LVVGFWRAAF ACPTSCKCSA SRIWCSDPSP GIVAFPRLEP 60

NSVDPENITE IFIANQKRLE IINEDDVEAY VGLRNLTIVD SGLKFVAHKA FLKNSNLQHI 120

NFTRNKLTSL SRKHFRHLDL SELILVGNPF TCSCDIMWIK TLQEAKSSPD TQDLYCLNES 180

SKNIPLANLQ IPNCGLPSAN LAAPNLTVEE GKSITLSCSV AGDPVPNMYW DVGNLVSKHM 240 

NETSHTQGSL RITNISSDDS GKQISCVAEN LVGEDQDSVN LTVHFAPTIT FLESPTSDHH 300 

WCIPFTVKGN PKPALQWFYN GAILNESKYI CTKIHVTNHT EYHGCLQLDN PTHMNNGDYT 360 

LIAKNEYGKD EKQISAHFMG WPGIDDGANP NYPDVIYEDY GTAANDIGDT TNRSNEIPST 420
DVTDKTGREH LSVYAVVVIA SVVGFCLLVM LFLLKLARHS KFGMKGFVLF HKIPLDG



Seq ID NO: 19 DNA sequence 

Nucleic Acid Accession #: NM_000228 

Coding sequence: 82-3600 

1 11 21 31 41 51 

| | | | | |

GCTTTCAGGC GATCTGGAGA AAGAACGGCA GAACACACAG CAAGGAAAGG TCCTTTCTGG 60 

GGATCACCCC ATTGGCTGAA GATGAGACCA TTCTTCCTCT TGTGTTTTGC CCTGCCTGGC 120 

CTCCTGCATG CCCAACAAGC CTGCTCCCGT GGGGCCTGCT ATCCACCTGT TGGGGACCTG 180 

CTTGTTGGGA GGACCCGGTT TCTCCGAGCT TCATCTACCT GTGGACTGAC CAAGCCTGAG 240 

ACCTACTGCA CCCAGTATGG CGAGTGGCAG ATGAAATGCT GCAAGTGTGA CTCCAGGCAG 300 

CCTCACAACT ACTACAGTCA CCGAGTAGAG AATGTGGCTT CATCCTCCGG CCCCATGCGC 360 

TGGTGGCAGT CCCAGAATGA TGTGAACCCT GTCTCTCTGC AGCTGGACCT GGACAGGAGA 420 

TTCCAGCTTC AAGAAGTCAT GATGGAGTTC CAGGGGCCCA TGCCCGCCGG CATGCTGATT 480 

GAGCGCTCCT CAGACTTCGG TAAGACCTGG CGAGTGTACC AGTACCTGGC TGCCGACTGC 540 

ACCTCCACCT TCCCTCGGGT CCGCCAGGGT CGGCCTCAGA GCTGGCAGGA TGTTCGGTGC 600 

CAGTCCCTGC CTCAGAGGCC TAATGCACGC CTAAATGGGG GGAAGGTCCA ACTTAACCTT 660 

ATGGATTTAG TGTCTGGGAT TCCAGCAACT CAAAGTCAAA AAATTCAAGA GGTGGGGGAG 720

ATCACAAACT TGAGAGTCAA TTTCACCAGG CTGGCCCCTG TGCCCCAAAG GGGCTACCAC 780

CCTCCCAGCG CCTACTATGC TGTGTCCCAG CTCCGTCTGC AGGGGAGCTG CTTCTGTCAC 840 

GGCCATGCTG ATCGCTGCGC ACCCAAGCCT GGGGCCTCTG CAGGCCCCTC CACCGCTGTG 900 

CAGGTCCACG ATGTCTGTGT CTGCCAGCAC AACACTGCCG GCCCAAATTG TGAGCGCTGT 960 

GCACCCTTCT ACAACAACCG GCCCTGGAGA CCGGCGGAGG GCCAGGACGC CCATGAATGC 1020 

CAAAGGTGCG ACTGCAATGG GCACTCAGAG ACATGTCACT TTGACCCCGC TGTGTTTGCC 1080 

GCCAGCCAGG GGGCATATGG AGGTGTGTGT GACAATTGCC GGGACCACAC CGAAGGCAAG 1140 

AACTGTGAGC GGTGTCAGCT GCACTATTTC CGGAACCGGC GCCCGGGAGC TTCCATTCAG 1200 

GAGACCTGCA TCTCCTGCGA GTGTGATCCG GATGGGGCAG TGCCAGGGGC TCCCTGTGAC 1260 

CCAGTGACCG GGCAGTGTGT GTGCAAGGAG CATGTGCAGG GAGAGCGCTG TGACCTATGC 1320

AAGCCGGGCT TCACTGGACT CACCTACGCC AACCCGCAGG GCTGCCACCG CTGTGACTGC 1380 

AACATCCTGG GGTCCCGGAG GGACATGCCG TGTGACGAGG AGAGTGGGCG CTGCCTTTGT 1440 

CTGCCCAACG TGGTGGGTCC CAAATGTGAC CAGTGTGCTC CCTACCACTG GAAGCTGGCC 1500 

AGTGGCCAGG GCTGTGAACC GTGTGCCTGC GACCCGCACA ACTCCCCTCA GCCCACAGTG 1560 

CAACCAGTTC ACAGGGCAGT GCCCTGTCGG GAAGGCTTTG GTGGCCTGAT GTGCAGCGCT 1620 

GCAGCCATCC GCCAGTGTCC AGACCGGACC TATGGAGACG TGGCCACAGG ATGCCGAGCC 1680 

TGTGACTGTG ATTTCCGGGG AACAGAGGGC CCGGGCTGCG ACAAGGCATC AGGCCGCTGC 1740

CTCTGCCGCC CTGGCTTGAC CGGGCCCCGC TGTGACCAGT GCCAGCGAGG CTACTGCAAT 1800 

CGCTACCCGG TGTGCGTGGC CTGCCACCCT TGCTTCCAGA CCTATGATGC GGACCTCCGG 1860 

GAGCAGGCCC TGCGCTTTGG TAGACTCCGC AATGCCACCG CCAGCCTGTG GTCAGGGCCT 1920

GGGCTGGAGG ACCGTGGCCT GGCCTCCCGG ATCCTAGATG CAAAGAGTAA GATTGAGCAG 1980

ATCCGAGCAG TTCTCAGCAG CCCCGCAGTC ACAGAGCAGG AGGTGGCTCA GGTGGCCAGT 2040

GCCATCCTCT CCCTCAGGCG AACTCTCCAG GGCCTGCAGC TGGATCTGCC CCTGGAGGAG 2100

GAGACGTTGT CCCTTCCGAG AGACCTGGAG AGTCTTGACA GAAGCTTCAA TGGTCTCCTT 2160 

ACTATGTATC AGAGGAAGAG GGAGCAGTTT GAAAAAATAA GCAGTGCTGA TCCTTCAGGA 2220

GCCTTCCGGA TGCTGAGCAC AGCCTACGAG CAGTCAGCCC AGGCTGCTCA GCAGGTCTCC 2280 

GACAGCTCGC GCCTTTTGGA CCAGCTCAGG GACAGCCGGA GAGAGGCAGA GAGGCTGGTG 2340 

CGGCAGGCGG GAGGAGGAGG AGGCACCGGC AGCCCCAAGC TTGTGGCCCT GAGGCTGGAG 2400 

ATGTCTTCGT TGCCTGACCT GACACCCACC TTCAACAAGC TCTGTGGCAA CTCCAGGCAG 2460 

ATGGCTTGCA CCCCAATATC ATGCCCTGGT GAGCTATGTC CCCAAGACAA TGGCACAGCC 2520 

TGTGGCTCCC GCTGCAGGGG TGTCCTTCCC AGGGCCGGTG GGGCCTTCTT GATGGCGGGG 2580 

CAGGTGGCTG AGCAGCTGCG GGGCTTCAAT GCCCAGCTCC AGCGGACCAG GCAGATGATT 2640 

AGGGCAGCCG AGGAATCTGC CTCACAGATT CAATCCAGTG CCCAGCGCTT GGAGACCCAG 2700 

GTGAGCGCCA GCCGCTCCCA GATGGAGGAA GATGTCAGAC GCACACGGCT CCTAATCCAG 2760 

CAGGTCCGGG ACTTCCTAAC AGACCCCGAC ACTGATGCAG CCACTATCCA GGAGGTCAGC 2820

GAGGCCGTGC TGGCCCTGTG GCTGCCCACA GACTCAGCTA CTGTTCTGCA GAAGATGAAT 2880 

GAGATCCAGG CCATTGCAGC CAGGCTCCCC AACGTGGACT TGGTGCTGTC CCAGACCAAG 2940

CAGGACATTG CGCGTGCCCG CCGGTTGCAG GCTGAGGCTG AGGAAGCCAG GAGCCGAGCC 3000

CATGCAGTGG AGGGCCAGGT GGAAGATGTG GTTGGGAACC TGCGGCAGGG GACAGTGGCA 3060 

CTGCAGGAAG CTCAGGACAC CATGCAAGGC ACCAGCCGCT CCCTTCGGCT TATCCAGGAC 3120

AGGGTTGCTG AGGTTCAGCA GGTACTGCGG CCAGCAGAAA AGCTGGTGAC AAGCATGACC 3180

AAGCAGCTGG GTGACTTCTG GACACGGATG GAGGAGCTCC GCCACCAAGC CCGGCAGCAG 3240 

GGGGCAGAGG CAGTCCAGGC CCAGCAGCTT GCGGAAGGTG CCAGCGAGCA GGCATTGAGT 3300 

GCCCAAGAGG GATTTGAGAG AATAAAACAA AAGTATGCTG AGTTGAAGGA CCGGTTGGGT 3360 
CAGAGTTCCA TGCTGGGTGA GCAGGGTGCC CGGATCCAGA GTGTGAAGAC AGAGGCAGAG 3420

GAGCTGTTTG GGGAGACCAT GGAGATGATG GACAGGATGA AAGACATGGA GTTGGAGCTG 3480 

CTGCGGGGCA GCCAGGCCAT CATGCTGCGC TCGGCGGACC TGACAGGACT GGAGAAGCGT 3540 

GTGGAGCAGA TCCGTGACCA CATCAATGGG CGCGTGCTCT ACTATGCCAC CTGCAAGTGA 3600 

TGCTACAGCT TCCAGCCCGT TGCCCCACTC ATCTGCCGCC TTTGCTTTTG GTTGGGGGCA 3660 

GATTGGGTTG GAATGCTTTC CATCTCCAGG AGACTTTCAT GCAGCCTAAA GTACAGCCTG 3720 

GACCACCCCT GGTGTGTAGC TAGTAAGATT ACCCTGAGCT GCAGCTGAGC CTGAGCCAAT 3780 

GGGACAGTTA CACTTGACAG ACAAAGATGG TGGAGATTGG CATGCCATTG AAACTAAGAG 3840 

CTCTCAAGTC AAGGAAGCTG GGCTGGGCAG TATCCCCCGC CTTTAGTTCT CCACTGGGGA 3900

GGAATCCTGG ACCAAGCACA AAAACTTAAC AAAAGTGATG TAAAAATGAA AAGCCAAATA 3960
AAAATCTTTG G 

Seq ID NO: 20 Protein sequence:
Protein Accession #: NP_000219

1 11 21 31 41 51

| | | | | |

MRPFFLLCFA LPGLLHAQQA CSRGACYPPV GDLLVGRTRF LRASSTCGLT KPETYCTQYG 60

EWQMKCCKCD SRQPHNYYSH RVENVASSSG PMRWWQSQND VNPVSLQLDL DRRFQLQEVM 120

MEFQGPMPAG MLIERSSDFG KTWRVYQYLA ADCTSTFPRV RQGRFQSWQD VRCQSLPQRP 180

NARLNGGKVQ LNLMDLVSGI PATQSQKIQE VGEITNLRVN FTRLAPVPQR GYHPPSAYYA 240

VSQLRLQGSC FCHGHADRCA PKPGASAGPS TAVQVHDVCV CQHNTAGPNC ERCAPFYNNR 300

PWRPAEGQDA HECQRCDCNG HSETCHFDPA VFAASQGAYG GVCDNCRDHT EGKNCERCQL 360

HYFRNRRPGA SIQETCISCE CDPDGAVPGA PCDPVTGQCV CKEHVQGERC DLCKPGFTGL 420

TYANPQGCHR CDCNILGSRR DMPCDEESGR CLCLPNVVGP KCDQCAPYHW KLASGQGCEP 480

CACDPHNSPQ PTVQPVHRAV PCREGFGGLM CSAAAIRQCP DRTYGDVATG CRACDCDFRG 540

TEGPGCDKAS GRCLCRPGLT GPRCDQCQRG YCNRYPVCVA CHPCFQTYDA DLREQALRFG 600

RLRNATASLW SGPGLEDRGL ASRILDAKSK IEQIRAVLSS PAVTEQEVAQ VASAILSLRR 660

TLQGLQLDLP LEEETLSLPR DLESLDRSFN GLLTMYQRKR EQFEKISSAD PSGAFRMLST 720

AYEQSAQAAQ QVSDSSRLLD QLRDSRREAE RLVRQAGGGG GTGSPKLVAL RLEMSSLPDL 780

TPTFNKLCGN SRQMACTPIS CPGELCPQDN GTACGSRCRG VLPRAGGAFL MAGQVAEQLR 840

GFNAQLQRTR QMIRAAEESA SQIQSSAQRL ETQVSASRSQ MEEDVRRTRL LIQQVRDFLT 900

DPDTDAATIQ EVSEAVLALW LPTDSATVLQ KMNEIQAIAA RLPNVDLVLS QTKQDIARAR 960

RLQAEAEEAR SRAHAVEGQV EDVVGNLRQG TVALQEAQDT MQGTSRSLRL IQDRVAEVQQ 1020

VLRPAEKLVT SMTKQLGDFW TRMEELRHQA RQQGAEAVQA QQLAEGASEQ ALSAQEGFER 1080

IKQKYAELKD RLGQSSMLGE QGARIQSVKT EAEELFGETM EMMDRMKDME LELLRGSQAI 1140
MLRSADLTGL EKRVEQIRDH INGRVLYYAT CK

Seq ID NO; 21 DNA sequence 

Nucleic Acid Accession #: NMJD03722 

Coding sequences 145-1491 



1 
I 

TCGTTGATAT 
ACAGTACTGC 
AAAGAAAGTT 
CCAGAGGT TT 
ATTGACTTGA 
AGCATGGACT 
ACGAACCTGG 
AGTCCCTATA 
CCCAGCTCCA 
CCAGGCCCGC 
TGGACGTATT 
CAGATCAAGG 
AAAAAAGCTG 
GAATTCAACG 
CATGCCCAGT 
CCACCCCAGG 
TGTGTTGGAG 
GGGCAAGTCC 
AGGAAGGCGG 
GATGGTACGA 
AAACGAAGAT 
GAAATGCTGT 
ATTGAAACGT 
CTTTCAGCCT 
GACGTCTTCT 
TCTATATTTT 
TGTGTGTGCG 
CCCAACTGCT 
TTACAAGAAA 
GAACCACTGT 
GAAAGGGGCA 
AATTCACAGG 
AAAAAAGTTG 
CCCTTTTAAT 
TACTGCTGGG 
TTTGTGAGAA 
GCTGTGTACC 
CATGAAACCC 
CTCATTTTGT 
TGTTTACCAT 
AATTTGCTTA 
CTGATACTGT 
AGACGTGTTA 



11 
I 

CAAAGACAGT 
CCTGACCCTT 
AT T ACCG AT C 
TCCAGCATAT 
ACTTTGTGGA 
GTATCCGCAT 
GGCTCCTGAA 
ACACAGACCA 
CCTTCGATGC 
ACAGTTT CG A 
CCACTGAACT 
TGATGACCCC 
AGCACGTCAC 
AGGGACAGAT 
ATGTAGAAGA 
TTGGCACTGA 
GGATGAACCG 
TGGGCCGACG 
ATGAAGATAG 
AGCGCCCGTT 
CCCCAGATGA 
TGAAGAT CAA 
ACAGGCAACA 
GCTTCAGGAA 
TTAGACATTC 
AAGTGTGTGT 
TGTGTATCTA 
CAAAGGCACA 
GGATGTTTTC 
GTTTGTCTGT 
TTAAGATGTT 
GAAGCTTTTG 
TTATTGTCTG 
GCTGGTCATG 
CAGCGAGGTG 
CTTGCATTAT 
TGCCTCTGCC 
TGGAAGACCT 
GCTTTTAATA 
TATTCAAAGC 
ATTAGAGCTT 
TCAGTGCATT 
AAATCAGCAC 



21 

I 

TGAAGGAAAT 
ACAT CCAGCG 
CACCATGTCC 
CTGGGATTTT 
TGAACCATCA 
GCAGGACTCG 
CAGCATGGAC 
CGCGCAGAAC 
TCTCTCTCCA 
CGTGTCCTTC 
GAAGAAACTC 
ACCTCCTCAG 
GGAGGTGGTG 
TGCCCCTCCT 
TCCCATCACA 
ATTCACGACA 
CCGTCCAATT 
CTGCTTTGAG 
CATCAGAAAG 
TCGT CAGAAC 
TGAACTGTTA 
AGAGTCCCTG 
GCAACAGCAG 
TGAGCTTGTG 
CAAGCCCCCA 
GTTGTATTTC 
GCCCTCATAA 
AAGCCACTAG 
TGCAGATTTT 
GAGCTTTCTG 
TATTGGAACC 
AGCAGGTCTC 
TG CATAAGT A 
TAATAATATT 
ATCATTACCA 
TTGTGTCCTC 
ACTGTATGTT 
ACTACAAAAA 
GAAAGACAAA 
TCAAAATAGA 
CTATCCCTCA 
TAGCCAGGAG 
TCCTGGACTG 



31 
I 

GAATTTTGAA 
TTTCGTAGAA 
CAGAGCACAC 
CTGGAACAGC 
GAAGATGGTG 
GACCTGAGTG 
CAGCAGATT C 
AGCGTCACGG 
TCACCCGCCA 
CAGCAGTCGA 
TACTGCCAAA 
GGAGCTGTTA 
AAGCGGTGCC 
AGTCATTTGA 
GGAAGACAGA 
GTCTTGTACA 
TTAATCATTG 
GCCCGGATCT 
CAGCAAGTTT 
ACACATGGTA 
TACTTACCAG 
GAACTCATGC 
CAGCACCAGC 
GAGCCCCGGA 
AACCGATCAG 
CATGTGTATA 
ACAGGACTTG 
TGAGAGAATC 
GTATCCTTAG 
TTGTTTCCTG 
CTTTTCTGTC 
AAACTTAAGA 
AGTTGTAGGT 
GCAAGTAGTA 
AAAGTAAT C A 
CCCTCATGTG 
GGCATCTGTT 
AACTGTTGTT 
TCCACCCCAG 
ATTTGAAGCC 
AGCCTACCTA 
ACTTACGTTT 
GAAATTAAAG 



41 

1 

ACTTCACGGT 
ACCCAGCTCA 
AGACAAATGA 
CTATATGTTC 
CGACAAACAA 
ACCCCATGTG 
AGAACGGCTC 
CGCCCTCGCC 
TCCCCTCCAA 
GCACCGCCAA 
TTGCAAAGAC 
TCCGCGCCAT 
CCAACCATGA 
TTCGAGTAGA 
GTGTGCTGGT 
ATTTCATGTG 
TTACTCTGGA 
GTGCTTGCCC 
CGGACAGTAC 
TCCAGATGAC 
TGAGGGGCCG 
AGTACCTTCC 
ACTTACTTCA 
GAGAAACTCC 
TGTACCCATA 
TGTGAGTGTG 
AAGACACTTT 
TTTTGAAGGG 
ACCGGCCATT 
GGAGGGAGGG 
TTCTTCTGTT 
TGTCTTTTTA
GACTGAGAGA 
AGAAACGAAG 
ACTTTGTGGG 
TAGGTAGAAC 
ATGCTAAAGT 
TGGCCCCCAT 
TAATATTGCC 
CTCTCACAAA 
CCATAAAACC 
TGAGTAAGTG 
ATTGAAAGGG 



51 
I 

GTGCCACCCT 
TTTCTCTTGG 
ATTCCTCAGT 
AGTTCAGCCC 
GATTGAGATT 
GCCACAGTAC 
CTCGTCCACC 
CTACGCACAG 
CACCGACTAC 
GTCGGCCACC 
ATGCCCCATC 
GCCTGTCTAC 
GCTGAGCCGT 
GGGGAACAGC 
ACCTTATGAG 
TAACAGCAGT 
AACCAGAGAT 
AGGAAGAGAC 
AAAGAACGGT 
ATCCATCAAG
TGAGACTTAT 
TCAGCACACA 
GAAACATCTC 
AAAACAATCT 
GAGCCCTATC 
TGTGTGTGTA 
GGCTCAGAGA
ACTCAAACCT 
GGTGGGTGAG 
GTCAGGTGGG 
GTTTTTCTAA 
AGAAAAGGAG 
CTCAGTCAGA 
GTGTCAAGTG 
TGGAGAGTTC 
ATTTCTTAAT 
TTTTCTTGTA 
AGCAGGTGAA 
CTTACGTAGT 
ATCTGTGATT 
AGCCATATTA 
AGATCCAAGC 
TAGACTACTT 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 
1500 
1560 
1620 
1680 
1740 
1800 
1860 
1920 
1980 
2040 
2100 
2160 
2220 
2280 
2340 
2400 
2460 
2520 
2580 
TTCTTTTTTT TACTCAAAAG TTTAGAGAAT CTCTGTTTCT TTCCATTTTA AAAACATATT 2640

TTAAGATAAT AGCATAAAGA CTTTAAAAAT GTTCCTCCCC TCCATCTTCC CACACCCAGT 2700

CACCAGCACT GTATTTTCTG TCACCAAGAC AATGATTTCT TGTTATTGAG GCTGTTGCTT 2760

TTGTGGATGT GTGATTTTAA TTTTCAATAA ACTTTTGCAT CTTGGTTTAA AAGAAA

Seq ID NO: 22 Protein sequence: 
Protein Accession #: NP_003713 

1 11 21 31 41 51 

| | | | | |

MSQSTQTNEF LSPEVFQHIW DFLEQPICSV QPIDLNFVDE PSEDGATNKI EISMDCIRMQ 60 

DSDLSDPMWP QYTNLGLLNS MDQQIQNGSS STSPYNTDHA QNSVTAPSPY AQPSSTFDAL 120 

SPSPAIPSNT DYPGPHSFDV SFQQSSTARS ATWTYSTELK RLYCQIARTC PIQIKVMTPP 180

PQGAVIRAMP VYKKAEHVTE VVKRCPNHEL SREFNEGQIA PPSHLIRVEG NSHAQYVEDP 240

ITGRQSVLVP YEPPQVGTEF TTVLYNFMCN SSCVGGMNRR PILIIVTLET RDGQVLGRRC 300 

FEARICACPG RDRKADEDSI RKQQVSDSTK NGDGTKRPFR QNTHGIQMTS IKKRRSPDDE 360

LLYLPVRGRE TYEMLLKIKE SLELMQYLPQ HTIETYRQQQ QQQHQHLLQK HLLSACFRNE 420 
LVEPRRETPK QSDVFFRHSK PPNRSVYP 



Seq ID NO: 23 DNA sequence

Nucleic Acid Accession #: NM_001944.1 

Coding sequence: 84-3083 

1 11 21 31 41 51 

| | | | | |

TTTTCTTAGA CATTAACTGC AGACGGCTGG CAGGATAGAA GCAGCGGCTC ACTTGGACTT 60 

TTTCACCAGG GAAATCAGAG ACAATGATGG GGCTCTTCCC CAGAACTACA GGGGCTCTGG 120

CCATCTTCGT GGTGGTCATA TTGGTTCATG GAGAATTGCG AATAGAGACT AAAGGTCAAT 180

ATGATGAAGA AGAGATGACT ATGCAACAAG CTAAAAGAAG GCAAAAACGT GAATGGGTGA 240

AATTTGCCAA ACCCTGCAGA GAAGGAGAAG ATAACTCAAA AAGAAACCCA ATTGCCAAGA 300

TTACTTCAGA TTACCAAGCA ACCCAGAAAA TCACCTACCG AATCTCTGGA GTGGGAATCG 360 

ATCAGCCGCC TTTTGGAATC TTTGTTGTTG ACAAAAACAC TGGAGATATT AACATAACAG 420 

CTATAGTCGA CCGGGAGGAA ACTCCAAGCT TCCTGATCAC ATGTCGGGCT CTAAATGCCC 480 

AAGGACTAGA TGTAGAGAAA CCACTTATAC TAACGGTTAA AATTTTGGAT ATTAATGATA 540 

ATCCTCCAGT ATTTTCACAA CAAATTTTCA TGGGTGAAAT TGAAGAAAAT AGTGCCTCAA 600

ACTCACTGGT GATGATACTA AATGCCACAG ATGCAGATGA ACCAAACCAC TTGAATTCTA 660 

AAATTGCCTT CAAAATTGTC TCTCAGGAAC CAGCAGGCAC ACCCATGTTC CTCCTAAGCA 720

GAAACACTGG GGAAGTCCGT ACTTTGACCA ATTCTCTTGA CCGAGAGCAA GCTAGCAGCT 780 

ATCGTCTGGT TGTGAGTGGT GCAGACAAAG ATGGAGAAGG ACTATCAACT CAATGTGAAT 840 

GTAATATTAA AGTGAAAGAT GTCAACGATA ACTTCCCAAT GTTTAGAGAC TCTCAGTATT 900 

CAGCACGTAT TGAAGAAAAT ATTTTAAGTT CTGAATTACT TCGATTTCAA GTAACAGATT 960

TGGATGAAGA GTACACAGAT AATTGGCTTG CAGTATATTT CTTTACCTCT GGGAATGAAG 1020

GAAATTGGTT TGAAATACAA ACTGATCCTA GAACTAATGA AGGCATCCTG AAAGTGGTGA 1080

AGGCTCTAGA TTATGAACAA CTACAAAGCG TGAAACTTAG TATTGCTGTC AAAAACAAAG 1140

CTGAATTTCA CCAATCAGTT ATCTCTCGAT ACCGAGTTCA GTCAACCCCA GTCACAATTC 1200

AGGTAATAAA TGTAAGAGAA GGAATTGCAT TCCGTCCTGC TTCCAAGACA TTTACTGTGC 1260

AAAAAGGCAT AAGTAGCAAA AAATTGGTGG ATTATATCCT GGGAACATAT CAAGCCATCG 1320

ATGAGGACAC TAACAAAGCT GCCTCAAATG TCAAATATGT CATGGGACGT AACGATGGTG 1380

GATACCTAAT GATTGATTCA AAAACTGCTG AAATCAAATT TGTCAAAAAT ATGAACCGAG 1440

ATTCTACTTT CATAGTTAAC AAAACAATCA CAGCTGAGGT TCTGGCCATA GATGAATACA 1500

CGGGTAAAAC TTCTACAGGC ACGGTATATG TTAGAGTACC CGATTTCAAT GACAATTGTC 1560 

CAACAGCTGT CCTCGAAAAA GATGCAGTTT GCAGTTCTTC ACCTTCCGTG GTTGTCTCCG 1620

CTAGAACACT GAATAATAGA TACACTGGCC CCTATACATT TGCACTGGAA GATCAACCTG 1680 

TAAAGTTGCC TGCCGTATGG AGTATCACAA CCCTCAATGC TACCTCGGCC CTCCTCAGAG 1740

CCCAGGAACA GATACCTCCT GGAGTATACC ACATCTCCCT GGTACTTACA GACAGTCAGA 1800

ACAATCGGTG TGAGATGCCA CGCAGCTTGA CACTGGAAGT CTGTCAGTGT GACAACAGGG 1860

GCATCTGTGG AACTTCTTAC CCAACCACAA GCCCTGGGAC CAGGTATGGC AGGCCGCACT 1920 

CAGGGAGGCT GGGGCCTGCC GCCATCGGCC TGCTGCTCCT TGGTCTCCTG CTGCTGCTGT 1980 

TGGCCCCCCT TCTGCTGTTG ACCTGTGACT GTGGGGCAGG TTCTACTGGG GGAGTGACAG 2040

GTGGTTTTAT CCCAGTTCCT GATGGCTCAG AAGGAACAAT TCATCAGTGG GGAATTGAAG 2100 

GAGCCCATCC TGAAGACAAG GAAATCACAA ATATTTGTGT GCCTCCTGTA ACAGCCAATG 2160

GAGCCGATTT CATGGAAAGT TCTGAAGTTT GTACAAATAC GTATGCCAGA GGCACAGCGG 2220 

TGGAAGGCAC TTCAGGAATG GAAATGACCA CTAAGCTTGG AGCAGCCACT GAATCTGGAG 2280 

GTGCTGCAGG CTTTGCAACA GGGACAGTGT CAGGAGCTGC TTCAGGATTC GGAGCAGCCA 2340 

CTGGAGTTGG CATCTGTTCC TCAGGGCAGT CTGGAACCAT GAGAACAAGG CATTCCACTG 2400 

GAGGAACCAA TAAGGACTAC GCTGATGGGG CGATAAGCAT GAATTTTCTG GACTCCTACT 2460 

TTTCTCAGAA AGCATTTGCC TGTGCGGAGG AAGACGATGG CCAGGAAGCA AATGACTGCT 2520 

TGTTGATCTA TGATAATGAA GGCGCAGATG CCACTGGTTC TCCTGTGGGC TCCGTGGGTT 2580 

GTTGCAGTTT TATTGCTGAT GACCTGGATG ACAGCTTCTT GGACTCACTT GGACCCAAAT 2640

TTAAAAAACT TGCAGAGATA AGCCTTGGTG TTGATGGTGA AGGCAAAGAA GTTCAGCCAC 2700

CCTCTAAAGA CAGCGGTTAT GGGATTGAAT CCTGTGGCCA TCCCATAGAA GTCCAGCAGA 2760 

CAGGATTTGT TAAGTGCCAG ACTTTGTCAG GAAGTCAAGG AGCTTCTGCT TTGTCCGCCT 2820

CTGGGTCTGT CCAGCCAGCT GTTTCCATCC CTGACCCTCT GCAGCATGGT AACTATTTAG 2880 

TAACGGAGAC TTACTCGGCT TCTGGTTCCC TCGTGCAACC TTCCACTGCA GGCTTTGATC 2940

CACTTCTCAC ACAAAATGTG ATAGTGACAG AAAGGGTGAT CTGTCCCATT TCCAGTGTTC 3000

CTGGCAACCT AGCTGGCCCA ACGCAGCTAC GAGGGTCACA TACTATGCTC TGTACAGAGG 3060

ATCCTTGCTC CCGTCTAATA TGACCAGAAT GAGCTGGAAT ACCACACTGA CCAAATCTGG 3120 

ATCTTTGGAC TAAAGTATTC AAAATAGCAT AGCAAAGCTC ACTGTATTGG GCTAATAATT 3180

TGGCACTTAT TAGCTTCTCT CATAAACTGA TCACGATTAT AAATTAAATG TTTGGGTTCA 3240 

TACCCCAAAA GCAATATGTT GTCACTCCTA ATTCTCAAGT ACTATTCAAA TTGTAGTAAA 3300 
TCTTAAAGTT TTTCAAAACC CTAAAATCAT ATTCGC 

Seq ID NO: 24 Protein sequence:
Protein Accession #: NP_001935.1

1 11 21 31 41 51

| | | | | |

MMGLFPRTTG ALAIFVVVIL VHGELRIETK GQYDEEEMTM QQAKRRQKRE WVKFAKPCRE 60

GEDNSKRNPI AKITSDYQAT QKITYRISGV GIDQPPFGIF VVDKNTGDIN ITAIVDREET 120

PSFLITCRAL NAQGLDVEKP LILTVKILDI NDNPPVFSQQ IFMGEIEENS ASNSLVMILN 180

ATDADEPNHL NSKIAFKIVS QEPAGTPMFL LSRNTGEVRT LTNSLDREQA SSYRLVVSGA 240

DKDGEGLSTQ CECNIKVKDV NDNFPMFRDS QYSARIEENI LSSELLRFQV TDLDEEYTDN 300

WLAVYFFTSG NEGNWFEIQT DPRTNEGILK VVKALDYEQL QSVKLSIAVK NKAEFHQSVI 360

SRYRVQSTPV TIQVINVREG IAFRPASKTF TVQKGISSKK LVDYILGTYQ AIDEDTNKAA 420

SNVKYVMGRN DGGYLMIDSK TAEIKFVKNM NRDSTFIVNK TITAEVLAID EYTGKTSTGT 480

VYVRVPDFND NCPTAVLEKD AVCSSSPSVV VSARTLNNRY TGPYTFALED QPVKLPAVWS 540

ITTLNATSAL LRAQEQIPPG VYHISLVLTD SQNNRCEMPR SLTLEVCQCD NRGICGTSYP 600

TTSPGTRYGR PHSGRLGPAA IGLLLLGLLL LLLAPLLLLT CDCGAGSTGG VTGGFIPVPD 660

GSEGTIHQWG IEGAHPEDKE ITNICVPPVT ANGADFMESS EVCTNTYARG TAVEGTSGME 720

MTTKLGAATE SGGAAGFATG TVSGAASGFG AATGVGICSS GQSGTMRTRH STGGTNKDYA 780

DGAISMNFLD SYFSQKAFAC AEEDDGQEAN DCLLIYDNEG ADATGSPVGS VGCCSFIADD 840

LDDSFLDSLG PKFKKLAEIS LGVDGEGKEV QPPSKDSGYG IESCGHPIEV QQTGFVKCQT 900

LSGSQGASAL SASGSVQPAV SIPDPLQHGN YLVTETYSAS GSLVQPSTAG FDPLLTQNVI 960

VTERVICPIS SVPGNLAGPT QLRGSHTMLC TEDPCSRLI

Seq ID NO: 25 DNA sequence

Nucleic Acid Accession #: Eos sequence

Coding sequence: 56-1642

1 11 21 31 41 51

| | | | | |

AGTATCCCAG GAGGAGCAAG TGGCACGTCT TCGGACCTAG GCTGCCCCTG CCGTCATGTC 60

GCAAGGGATC CTTTCTCCGC CAGCGGGCTT GCTGTCCGAT GACGATGTCG TAGTTTCTCC 120

CATGTTTGAG TCCACAGCTG CAGATTTGGG GTCTGTGGTA CGCAAGAACC TGCTATCAGA 180

CTGCTCTGTC GTCTCTACCT CCCTAGAGGA CAAGCAGCAG GTTCCATCTG AGGACAGTAT 240

GGAGAAGGTG AAAGTATACT TGAGGGTTAG GCCCTTGTTA CCTTCAGAGT TGGAACGACA 300

GGAAGATCAG GGTTGTGTCC GTATTGAGAA TGTGGAGACC CTTGTTCTAC AAGCACCCAA 360

GGACTCTTTT GCCCTGAAGA GCAATGAACG GGGAATTGGC CAAGCCACAC ACAGGTTCAC 420

CTTTTCCCAG ATCTTTGGGC CAGAAGTGGG ACAGGCATCC TTCTTCAACC TAACTGTGAA 480

GGAGATGGTA AAGGATGTAC TCAAAGGGCA GAACTGGCTC ATCTATACAT ATGGAGTCAC 540

TAACTCAGGG AAAACCCACA CGATTCAAGG TACCATCAAG GATGGAGGGA TTCTCCCCCG 600

GTCCCTGGCG CTGATCTTCA ATAGCCTCCA AGGCCAACTT CATCCAACAC CTGATCTGAA 660

GCCCTTGCTC TCCAATGAGG TAATCTGGCT AGACAGCAAG CAGATCCGAC AGGAGGAAAT 720

GAAGAAGCTG TCCCTGCTAA ATGGAGGCCT CCAAGAGGAG GAGCTGTCCA CTTCCTTGAA 780

GAGGAGTGTC TACATCGAAA GTCGGATAGG TACCAGCACC AGCTTCGACA GTGGCATTGC 840

TGGGCTCTCT TCTATCAGTC AGTGTACCAG CAGTAGCCAG CTGGATGAAA CAAGTCATCG 900

ATGGGCACAG CCAGACACTG CCCCACTACC TGTCCCGGCA AACATTCGCT TCTCCATCTG 960

GATCTCATTC TTTGAGATCT ACAACGAACT GCTTTATGAC CTATTAGAAC CGCCTAGCCA 1020

ACAGCGCAAG AGGCAGACTT TGCGGCTATG CGAGGATCAA AATGGCAATC CCTATGTGAA 1080

AGATCTCAAC TGGATTCATG TGCAAGATGC TGAGGAGGCC TGGAAGCTCC TAAAAGTGGG 1140

TCGTAAGAAC CAGAGCTTTG CCAGCACCCA CCTCAACCAG AACTCCAGCC GCAGTCACAG 1200

CATCTTCTCA ATCAGGATCC TACACCTTCA GGGGGAAGGA GATATAGTCC CCAAGATCAG 1260

CGAGCTGTCA CTCTGTGATC TGGCTGGCTC AGAGCGCTGC AAAGATCAGA AGAGTGGTGA 1320

ACGGTTGAAG GAAGCAGGAA ACATTAACAC CTCTCTACAC ACCCTGGGCC GCTGTATTGC 1380

TGCCCTTCGT CAAAACCAGC AGAACCGGTC AAAGCAGAAC CTGGTTCCCT TCCGTGACAG 1440

CAAGTTGACT CGAGTGTTCC AAGGTTTCTT CACAGGCCGA GGCCGTTCCT GCATGATTGT 1500

CAATGTGAAT CCCTGTGCAT CTACCTATGA TGAAACTCTT CATGTGGCCA AGTTCTCAGC 1560

CATTGCTAGC CAGGTGACTT GTGCATGCCC CACCTATGCA ACTGGGATTC CCATCCCTGC 1620

ACTCGTTCAT CAAGGAACAT AGTCTTCAGG TATCCCCCAG CTTAGAGAAA GGGGCTAAGG 1680

CAGACACAGG CCTTGATGAT GATATTGAAA ATGAAGCTGA CATCTCCATG TATGGCAAAG 1740

AGGAGCTCCT ACAAGTTGTG GAAGCCATGA AGACACTGCT TTTGAAGGAA CGACAGGAAA 1800

AGCTACAGCT GGAGATGCAT CTCCGAGATG AAATTTGCAA TGAGATGGTA GAACAGATGC 1860

AACAGCGGGA ACAGTGGTGC AGTGAACATT TGGACACCCA AAAGGAACTA TTGGAGGAAA 1920

TGTATGAAGA AAAACTAAAT ATCCTCAAGG AGTCACTGAC AAGTTTTTAC CAAGAAGAGA 1980

TTCAGGAGCG GGATGAAAAG ATTGAAGAGC TAGAAGCTCT CTTGCAGGAA GCCAGACAAC 2040

AGTCAGTGGC CCATCAGCAA TCAGGGTCTG AATTGGCCCT ACGGCGGTCA CAAAGGTTGG 2100

CAGCTTCTGC CTCCACCCAG CAGCTTCAGG AGGTTAAAGC TAAATTACAG CAGTGCAAAG 2160

CAGAGCTAAA CTCTACCACT GAAGAGTTGC ATAAGTATCA GAAAATGTTA GAACCACCAC 2220

CCTCAGCCAA GCCCTTCACC ATTGATGTGG ACAAGAAGTT AGAAGAGGGC CAGAAGAATA 2280

TAAGGCTGTT GCGGACAGAG CTTCAGAAAC TTGGTGAGTC TCTCCAATCA GCAGAGAGAG 2340

CTTGTTGCCA CAGCACTGGG GCAGGAAAAC TTCGTCAAGC CTTGACCACT TGTGATGACA 2400

TCTTAATCAA ACAGGACCAG ACTCTGGCTG AACTGCAGAA CAACATGGTG CTAGTGAAAC 2460

TGGACCTTCG GAAGAAGGCA GCATGTATTG CTGAGCAGTA TCATACTGTG TTGAAACTCC 2520

AAGGCCAGGT TTCTGCCAAA AAGCGCCTTG GTACCAACCA GGAAAATCAG CAACCAAACC 2580

AACAACCACC AGGGAAGAAA CCATTCCTTC GAAATTTACT TCCCCGAACA CCAACCTGCC 2640

AAAGCTCAAC AGACTGCAGC CCTTATGCCC GGATCCTACG CTCACGGCGT TCCCCTTTAC 2700

TCAAATCTGG GCCTTTTGGC AAAAAGTACT AAGGCTGTGG GGAAAGAGAA GAGCAGTCAT 2760

GGCCCTGAGG TGGGTCAGCT ACTCTCCTGA AGAAATAGGT CTCTTTTATG CTTTACCATA 2820

TATCAGGAAT TATATCCAGG ATGCAATACT CAGACACTAG CTTTTTTCTC ACTTTTGTAT 2880

TATAACCACC TATGTAATCT CATGTTGTTG TTTTTTTTTA TTTACTTATA TGATTTCTAT 2940

GCACACAAAA ACAGTTATAT TAAAGATATT ATTGTTCACA TTTTTTATTG AATTCCAAAT 3000

GTAGCAAAAT CATTAAAACA AATTATAAAA GGGACAGAAA AA

Seq ID NO: 26 Protein sequence:
Protein Accession #: Eos sequence

1 11 21 31 41 51

| | | | | |

MSQGILSPPA GLLSDDDVVV SPMFESTAAD LGSVVRKNLL SDCSVVSTSL EDKQQVPSED 60

SMEKVKVYLR VRPLLPSELE RQEDQGCVRI ENVETLVLQA PKDSFALKSN ERGIGQATHR 120

FTFSQIFGPE VGQASFFNLT VKEMVKDVLK GQNWLIYTYG VTNSGKTHTI QGTIKDGGIL 180

PRSLALIFNS LQGQLHPTPD LKPLLSNEVI WLDSKQIRQE EMKKLSLLNG GLQEEELSTS 240

LKRSVYIESR IGTSTSFDSG IAGLSSISQC TSSSQLDETS HRWAQPDTAP LPVPANIRFS 300

IWISFFEIYN ELLYDLLEPP SQQRKRQTLR LCEDQNGNPY VKDLNWIHVQ DAEEAWKLLK 360

VGRKNQSFAS THLNQNSSRS HSIFSIRILH LQGEGDIVPK ISELSLCDLA GSERCKDQKS 420

GERLKEAGNI NTSLHTLGRC IAALRQNQQN RSKQNLVPFR DSKLTRVFQG FFTGRGRSCM 480

IVNVNPCAST YDETLHVAKF SAIASQVTCA CPTYATGIPI PALVHQGT

Seq ID NO: 27 DNA sequence

Nucleic Acid Accession #: Eos sequence

Coding sequence: 13-1424

1 11 21 31 41 51

| | | | | |

TAGAAGTTTA CAATGAAGTT TCTTCTAATA CTGCTCCTGC AGGCCACTGC TTCTGGAGCT 60

CTTCCCCTGA ACAGCTCTAC AAGCCTGGAA AAAAATAATG TGCTATTTGG TGAAAGATAC 120

TTAGAAAAAT TTTATGGCCT TGAGATAAAC AAACTTCCAG TGACAAAAAT GAAATATAGT 180

GGAAACTTAA TGAAGGAAAA AATCCAAGAA ATGCAGCACT TCTTGGGTCT GAAAGTGACC 240

GGGCAACTGG ACACATCTAC CCTGGAGATG ATGCACGCAC CTCGATGTGG AGTCCCCGAT 300

GTCCATCATT TCAGGGAAAT GCCAGGGGGG CCCGTATGGA GGAAACATTA TATCACCTAC 360

AGAATCAATA ATTACACACC TGACATGAAC CGTGAGGATG TTGACTACGC AATCCGGAAA 420

GCTTTCCAAG TATGGAGTAA TGTTACCCCC TTGAAATTCA GCAAGATTAA CACAGGCATG 480

GCTGACATTT TGGTGGTTTT TGCCCGTGGA GCTCATGGAG ACTTCCATGC TTTTGATGGC 540

AAAGGTGGAA TCCTAGCCCA TGCTTTTGGA CCTGGATCTG GCATTGGAGG GGATGCACAT 600

TTCGATGAGG ACGAATTCTG GACTACACAT TCAGGAGGCA CAAACTTGTT CCTCACTGCT 660

GTTCACGAGA TTGGCCATTC CTTAGGTCTT GGCCATTCTA GTGATCCAAA GGCCGTAATG 720

TTCCCCACCT ACAAATATGT TGACATCAAC ACATTTCGCC TCTCTGCTGA TGACATACGT 780

GGCATTCAGT CCCTGTATGG AGACCCAAAA GAGAACCAAC GCTTGCCAAA TCCTGACAAT 840

TCAGAACCAG CTCTCTGTGA CCCCAATTTG AGTTTTGATG CTGTCACTAC CGTGGGAAAT 900

AAGATCTTTT TCTTCAAAGA CAGGTTCTTC TGGCTGAAGG TTTCTGAGAG ACCAAAGACC 960

AGTGTTAATT TAATTTCTTC CTTATGGCCA ACCTTGCCAT CTGGCATTGA AGCTGCTTAT 1020

GAAATTGAAG CCAGAAATCA AGTTTTTCTT TTTAAAGATG ACAAATACTG GTTAATTAGC 1080

AATTTAAGAC CAGAGCCAAA TTATCCCAAG AGCATACATT CTTTTGGTTT TCCTAACTTT 1140

GTGAAAAAAA TTGATGCAGC TGTTTTTAAC CCACGTTTTT ATAGGACCTA CTTCTTTGTA 1200

GATAACCAGT ATTGGAGGTA TGATGAAAGG AGACAGATGA TGGACCCTGG TTATCCCAAA 1260

CTGATTACCA AGAACTTCCA AGGAATCGGG CCTAAAATTG ATGCAGTCTT CTACTCTAAA 1320

AACAAATACT ACTATTTCTT CCAAGGATCT AACCAATTTG AATATGACTT CCTACTCCAA 1380

CGTATCACCA AAACACTGAA AAGCAATAGC TGGTTTGGTT GTTGAAAATG GTGTAATTAA 1440

TGGTTTTTGT TAGTTCACTT CAGCTTAATA AGTATTTATT GCATATTTGC TATGTCCTCA 1500

GTGTACCACT ACTTAGAGAT ATGTATCATA AAAATAAAAT CTGTAAACCA TAGGTAATGA 1560

TTATATAAAA TACATAATAT TTTTCAATTT TGAAAACTCT AATTGTCCAT TCTTGCTTGA 1620

CTCTACTATT AAGTTTGAAA ATAGTTACCT TCAAAGCAAG ATAATTCTAT TTGAAGCATG 1680

CTCTGTAAGT TGCTTCCTAA CATCCTTGGA CTGAGAAATT ATACTTACTT CTGGCATAAC 1740

TAAAATTAAG TATATATATT TTGGCTCAAA TAAAATTG



Seq ID NO: 28 Protein sequence:
Protein Accession #: Eos sequence

1 11 21 31 41 51

| | | | | |

MKFLLILLLQ ATASGALPLN SSTSLEKNNV LFGERYLEKF YGLEINKLPV TKMKYSGNLM 60

KEKIQEMQHF LGLKVTGQLD TSTLEMMHAP RCGVPDVHHF REMPGGPVWR KHYITYRINN 120

YTPDMNREDV DYAIRKAFQV WSNVTPLKFS KINTGMADIL VVFARGAHGD FHAFDGKGGI 180

LAHAFGPGSG IGGDAHFDED EFWTTHSGGT NLFLTAVHEI GHSLGLGHSS DPKAVMFPTY 240

KYVDINTFRL SADDIRGIQS LYGDPKENQR LPNPDNSEPA LCDPNLSFDA VTTVGNKIFF 300

FKDRFFWLKV SERPKTSVNL ISSLWPTLPS GIEAAYEIEA RNQVFLFKDD KYWLISNLRP 360

EPNYPKSIHS FGFPNFVKKI DAAVFNPRFY RTYFFVDNQY WRYDERRQMM DPGYPKLITK 420

NFQGIGPKID AVFYSKNKYY YFFQGSNQFE YDFLLQRITK TLKSNSWFGC

Seq ID NO: 29 DNA sequence

Nucleic Acid Accession #: NM_006115.1

Coding sequence: 236..1765

1 11 21 31 41 51

| | | | | |

GCTTCAGGGT ACAGCTCCCC CGCAGCCAGA AGCCGGGCCT GCAGCCCCTC AGCACCGCTC 60

CGGGACACCC CACCCGCTTC CCAGGCGTGA CCTGTCAACA GCAACTTCGC GGTGTGGTGA 120

ACTCTCTGAG GAAAAACCAT TTTGATTATT ACTCTCAGAC GTGCGTGGCA ACAAGTGACT 180

GAGACCTAGA AATCCAAGCG TTGGAGGTCC TGAGGCCAGC CTAAGTCGCT TCAAAATGGA 240

ACGAAGGCGT TTGTGGGGTT CCATTCAGAG CCGATACATC AGCATGAGTG TGTGGACAAG 300

CCCACGGAGA CTTGTGGAGC TGGCAGGGCA GAGCCTGCTG AAGGATGAGG CCCTGGCCAT 360

TGCCGCCCTG GAGTTGCTGC CCAGGGAGCT CTTCCCGCCA CTCTTCATGG CAGCCTTTGA 420

CGGGAGACAC AGCCAGACCC TGAAGGCAAT GGTGCAGGCC TGGCCCTTCA CCTGCCTCCC 480

TCTGGGAGTG CTGATGAAGG GACAACATCT TCACCTGGAG ACCTTCAAAG CTGTGCTTGA 540

TGGACTTGAT GTGCTCCTTG CCCAGGAGGT TCGCCCCAGG AGGTGGAAAC TTCAAGTGCT 600

GGATTTACGG AAGAACTCTC ATCAGGACTT CTGGACTGTA TGGTCTGGAA ACAGGGCCAG 660

TCTGTACTCA TTTCCAGAGC CAGAAGCAGC TCAGCCCATG ACAAAGAAGC GAAAAGTAGA 720

TGGTTTGAGC ACAGAGGCAG AGCAGCCCTT CATTCCAGTA GAGGTGCTCG TAGACCTGTT 780

CCTCAAGGAA GGTGCCTGTG ATGAATTGTT CTCCTACCTC ATTGAGAAAG TGAAGCGAAA 840

GAAAAATGTA CTACGCCTGT GCTGTAAGAA GCTGAAGATT TTTGCAATGC CCATGCAGGA 900

TATCAAGATG ATCCTGAAAA TGGTGCAGCT GGACTCTATT GAAGATTTGG AAGTGACTTG 960

TACCTGGAAG CTACCCACCT TGGCGAAATT TTCTCCTTAC CTGGGCCAGA TGATTAATCT 1020

GCGTAGACTC CTCCTCTCCC ACATCCATGC ATCTTCCTAC ATTTCCCCGG AGAAGGAAGA 1080

GCAGTATATC GCCCAGTTCA CCTCTCAGTT CCTCAGTCTG CAGTGCCTGC AGGCTCTCTA 1140

TGTGGACTCT TTATTTTTCC TTAGAGGCCG CCTGGATCAG TTGCTCAGGC ACGTGATGAA 1200

CCCCTTGGAA ACCCTCTCAA TAACTAACTG CCGGCTTTCG GAAGGGGATG TGATGCATCT 1260

GTCCCAGAGT CCCAGCGTCA GTCAGCTAAG TGTCCTGAGT CTAAGTGGGG TCATGCTGAC 1320

CGATGTAAGT CCCGAGCCCC TCCAAGCTCT GCTGGAGAGA GCCTCTGCCA CCCTCCAGGA 1380

CCTGGTCTTT GATGAGTGTG GGATCACGGA TGATCAGCTC CTTGCCCTCC TGCCTTCCCT 1440

GAGCCACTGC TCCCAGCTTA CAACCTTAAG CTTCTACGGG AATTCCATCT CCATATCTGC 1500

CTTGCAGAGT CTCCTGCAGC ACCTCATCGG GCTGAGCAAT CTGACCCACG TGCTGTATCC 1560

TGTCCCCCTG GAGAGTTATG AGGACATCCA TGGTACCCTC CACCTGGAGA GGCTTGCCTA 1620

TCTGCATGCC AGGCTCAGGG AGTTGCTGTG TGAGTTGGGG CGGCCCAGCA TGGTCTGGCT 1680

TAGTGCCAAC CCCTGTCCTC ACTGTGGGGA CAGAACCTTC TATGACCCGG AGCCCATCCT 1740

GTGCCCCTGT TTCATGCCTA ACTAGCTGGG TGCACATATC AAATGCTTCA TTCTGCATAC 1800

TTGGACACTA AAGCCAGGAT GTGCATGCAT CTTGAAGCAA CAAAGCAGCC ACAGTTTCAG 1860

ACAAATGTTC AGTGTGAGTG AGGAAAACAT GTTCAGTGAG GAAAAAACAT TCAGACAAAT 1920

GTTCAGTGAG GAAAAAAAGG GGAAGTTGGG GATAGGCAGA TGTTGACTTG AGGAGTTAAT 1980

GTGATCTTTG GGGAGATACA TCTTATAGAG TTAGAAATAG AATCTGAATT TCTAAAGGGA 2040

GATTCTGGCT TGGGAAGTAC ATGTAGGAGT TAATCCCTGT GTAGACTGTT GTAAAGAAAC 2100

TGTTGAAAAT AAAGAGAAGC AATGTGAAGC AAAAAAAAAA AAAAAAAA

Seq ID NO: 30 Protein sequence:
Protein Accession #: NP_006106.1

Seq ID NO: 31 DNA sequence

Nucleic Acid Accession #: Eos sequence

Coding sequence: 64-2754

1 11 21 31 41 51

| | | | | |

GGCAGGTCTC GCTCTCGGCA CCCTCCCGGC GCCCGCGTTC TCCTGGCCCT GCCCGGCATC 60

CCGATGGCCG CCGCTGGGCC CCGGCGCTCC GTGCGCGGAG CCGTCTGCCT GCATCTGCTG 120

CTGACCCTCG TGATCTTCAG TCGTGATGGT GAAGCCTGCA AAAAGGTGAT ACTTAATGTA 180

CCTTCTAAAC TAGAGGCAGA CAAAATAATT GGCAGAGTTA ATTTGGAAGA GTGCTTCAGG 240

TCTGCAGACC TCATCCGGTC AAGTGATCCT GATTTCAGAG TTCTAAATGA TGGGTCAGTG 300

TACACAGCCA GGGCTGTTGC GCTGTCTGAT AAGAAAAGAT CATTTACCAT ATGGCTTTCT 360

GACAAAAGGA AACAGACACA GAAAGAGGTT ACTGTGCTGC TAGAACATCA GAAGAAGGTA 420

TCGAAGACAA GACACACTAG AGAAACTGTT CTCAGGCGTG CCAAGAGGAG ATGGGCACCT 480

ATTCCTTGCT CTATGCAAGA GAATTCCTTG GGCCCTTTCC CATTGTTTCT TCAACAAGTT 540

GAATCTGATG CAGCACAGAA CTATACTGTC TTCTACTCAA TAAGTGGACG TGGAGTTGAT 600

AAAGAACCTT TAAATTTGTT TTATATAGAA AGAGACACTG GAAATCTATT TTGCACTCGG 660

CCTGTGGATC GTGAAGAATA TGATGTTTTT GATTTGATTG CTTATGCGTC AACTGCAGAT 720

GGATATTCAG CAGATCTGCC CCTCCCACTA CCCATCAGGG TAGAGGATGA AAATGACAAC 780

CACCCTGTTT TCACAGAAGC AATTTATAAT TTTGAAGTTT TGGAAAGTAG TAGACCTGGT 840

ACTACAGTGG GGGTGGTTTG TGCCACAGAC AGAGATGAAC CGGACACAAT GCATACGCGC 900

CTGAAATACA GCATTTTGCA GCAGACACCA AGGTCACCTG GGCTCTTTTC TGTGCATCCC 960

AGCACAGGCG TAATCACCAC AGTCTCTCAT TATTTGGACA GAGAGGTTGT AGACAAGTAC 1020

TCATTGATAA TGAAAGTACA AGACATGGAT GGCCAGTTTT TTGGATTGAT AGGCACATCA 1080

ACTTGTATCA TAACAGTAAC AGATTCAAAT GATAATGCAC CCACTTTCAG ACAAAATGCT 1140

TATGAAGCAT TTGTAGAGGA AAATGCATTC AATGTGGAAA TCTTACGAAT ACCTATAGAA 1200

GATAAGGATT TAATTAACAC TGCCAATTGG AGAGTCAATT TTACCATTTT AAAGGGAAAT 1260

GAAAATGGAC ATTTCAAAAT CAGCACAGAC AAAGAAACTA ATGAAGGTGT TCTTTCTGTT 1320

GTAAAGCCAC TGAATTATGA AGAAAACCGT CAAGTGAACC TGGAAATTGG AGTAAACAAT 1380

GAAGCGCCAT TTGCTAGAGA TATTCCCAGA GTGACAGCCT TGAACAGAGC CTTGGTTACA 1440

GTTCATGTGA GGGATCTGGA TGAGGGGCCT GAATGCACTC CTGCAGCCCA ATATGTGCGG 1500

ATTAAAGAAA ACTTAGCAGT GGGGTCAAAG ATCAACGGCT ATAAGGCATA TGACCCCGAA 1560

AATAGAAATG GCAATGGTTT AAGGTACAAA AAATTGCATG ATCCTAAAGG TTGGATCACC 1620

ATTGATGAAA TTTCAGGGTC AATCATAACT TCCAAAATCC TGGATAGGGA GGTTGAAACT 1680 

CCCAAAAATG AGTTGTATAA TATTACAGTC CTGGCAATAG ACAAAGATGA TAGATCATGT 1740 

ACTGGAACAC TTGCTGTGAA CATTGAAGAT GTAAATGATA ATCCACCAGA AATACTTCAA 1800 

GAATATGTAG TCATTTGCAA ACCAAAAATG GGGTATACCG ACATTTTAGC TGTTGATCCT 1860 

GATGAACCTG TCCATGGAGC TCCATTTTAT TTCAGTTTGC CCAATACTTC TCCAGAAATC 1920

AGTAGACTGT GGAGCCTCAC CAAAGTTAAT GATACAGCTG CCCGTCTTTC ATATCAGAAA 1980

AATGCTGGAT TTCAAGAATA TACCATTCCT ATTACTGTAA AAGACAGGGC CGGCCAAGCT 2040

GCAACAAAAT TATTGAGAGT TAATCTGTGT GAATGTACTC ATCCAACTCA GTGTCGTGCG 2100 

ACTTCAAGGA GTACAGGAGT AATACTTGGA AAATGGGCAA TCCTTGCAAT ATTACTGGGT 2160

ATAGCACTGC TCTTTTCTGT ATTGCTAACT TTAGTATGTG GAGTTTTTGG TGCAACTAAA 2220 

GGGAAACGTT TTCCTGAAGA TTTAGCACAG CAAAACTTAA TTATATCAAA CACAGAAGCA 2280

CCTGGAGACG ATAGAGTGTG CTCTGCCAAT GGATTTATGA CCCAAACTAC CAACAACTCT 2340 

AGCCAAGGTT TTTGTGGTAC TATGGGATCA GGAATGAAAA ATGGAGGGCA GGAAACCATT 2400 

GAAATGATGA AAGGAGGAAA CCAGACCTTG GAATCCTGCC GGGGGGCTGG GCATCATCAT 2460

ACCCTGGACT CCTGCAGGGG AGGACACACG GAGGTGGACA ACTGCAGATA CACTTACTCG 2520 

GAGTGGCACA GTTTTACTCA ACCCCGTCTC GGTGAAAAAT TGCATCGATG TAATCAGAAT 2580 

GAAGACCGCA TGCCATCCCA AGATTATGTC CTCACTTATA ACTATGAGGG AAGAGGATCT 2640 

CCAGCTGGTT CTGTGGGCTG CTGCAGTGAA AAGCAGGAAG AAGATGGCCT TGACTTTTTA 2700

AATAATTTGG AACCCAAATT TATTACATTA GCAGAAGCAT GCACAAAGAG ATAATGTCAC 2760

AGTGCTACAA TTAGGTCTTT GTCAGACATT CTGGAGGTTT CCAAAAATAA TATTGTAAAG 2820

TTCAATTTCA ACATGTATGT ATATGATGAT TTTTTTCTCA ATTTTGAATT ATGCTACTCA 2880 

CCAATTTATA TTTTTAAAGC CAGTTGTTGC TTATCTTTTC CAAAAAGTGA AAAATGTTAA 2940 

AACAGACAAC TGGTAAATCT CAAACTCCAG CACTGGAATT AAGGTCTCTA AAGCATCTGC 3000

TCTTTTTTTT TTTTACGGAT ATTTTAGTAA TAAATATGCT GGATAAATAT TAGTCCAACA 3060

ATAGGTAAGT TATGCTAATA TCACATTATT ATGTATTCAC TTTAAGTGAT AGTTTAAAAA 3120 

ATAAACAAGA AATATTGAGT ATCACTATGT GAAGAAAGTT TTGGAAAAGA AACAATGAAG 3180 

ACTGAATTAA ATTAAAAATG TTGCAGCTCA TAAAGAATTG GGACTCACCC CTACTGCACT 3240 

ACCAAATTCA TTTGACTTTG GAGGCAAAAT GTGTTGAAGT GCCCTATGAA GTAGCAATTT 3300 

TCTATAGGAA TATAGTTGGA AATAAATGTG TGTGTGTATA TTATTATTAA TCAATGCAAT 3360 

ATTTAAAATG AAATGAGAAC AAAGAGGAAA ATGGTAAAAA CTTGAAATGA GGCTGGGGTA 3420

TAGTTTGTCC TACAATAGAA AAAAGAGAGA GCTTCCTAGG CCTGGGCTCT TAAATGCTGC 3480 

ATTATAACTG AGTCTATGAG GAAATAGTTC CTGTCCAATT TGTGTAATTT GTTTAAAATT 3540

GTAAATAAAT TAAACTTTTC TGGTTTCTGT GGGAAGGAAA TAGGGAATCC AATGGAACAG 3600 

TAGCTTTGCT TTGCAGTCTG TTTCAAGATT TCTGCATCCA CAAGTTAGTA GCAAACTGGG 3660

GAATACTCGC TGCAGCTGGG GTTCCCTGCT TTTTGGTAGC AAGGGTCCAG AGATGAGGTG 3720

TTTTTTTCGG GGAGCTAATA ACAAAAACAT TTTAAAACTT ACCTTTACTG AAGTTAAATC 3780 

CTCTATTGCT GTTTCTATTC TCTCTTATAG TGACCAACAT CTTTTTAATT TAGATCCAAA 3840

TAACCATGTC CTCCTAGAGT TTAGAGGCTA GAGGGAGCTG AGGGGAGGAT CTTACTGAAA 3900 

GCACCCTGGG GAGATTGATT GTCCTTAAAC CTAAGCCCCA CAAACTTGAC ACCTGATCAG 3960

GTCTGGGAGC TACAAAATTT CATTTTTCTC CTCACTGCCC TTCTTCTGAG TGGCATTGGC 4020

CTGAATCAAG GAAAGCCAGG CCTTGTGGGC CCCCTTCTTT CGGCTTTCTG CTAAAGCAAC 4080 

ACCTCCAGCA GAGATTCCCT TAAGTGACTC CAGGTTTTCC ACCATCCTTC AGCGTGAATT 4140 

AATTTTTAAT CAGTTTGCTT TCTCCAGAGA AATTTTAAAA TAATAGAAGA AATAGAAATT 4200

TTGAATGTAT AAAAGAAAAA GATCAAGTTG TCATTTTAGA ACAGAGGGAA CTTTGGGAGA 4260

AAGCAGCCCA AGTAGGTTAT TTGTACAGTC AGAGGGCAAC AGGAAGATGC AGGCCTTCAA 4320

GGGCAAGGAG AGGCCACAAG GAATATGGGT GGGAGTAAAA GCAACATCGT CTGCTTCATA 4380

CTTTTTCCTA GGCTTGGCAC TGCCTTTTCC TTTCTCAGGC CAATGGCAAC TGCCATTTGA 4440 

GTCCGGTGAG GGATCAGCCA ACCTCTTCTC TATGGCTCAC CTTATTTGGA GTGAGAAATC 4500

AAGGAGACAG AGCTGACTGC ATGATGAGTC TGAAGGCATT TGCAGGATGA GCCTGAACTG 4560 

GTTGTGCAGA ACAAACAAGG CATTCATGGG AATTGTTGTA TTCCTTCTGC AGCCCTCCTT 4620

CTGGGCACTA AGAAGGTCTA TGAATTAAAT GCCTATCTAA AATTCTGATT TATTCCTACA 4680 

TTTTCTGTTT TCTAATTTGA CCCTAAAATC TATGTGTTTT AGACTTAGAC TTTTTATTGC 4740 

CCCCCCCCCC TTTTTTTTTG AGACGGAGTC TCGCTCTGAC GCACAGGCTG GAGTGCAGTG 4800 

GCTCCGATCT CTGCTCACTG AAAGCTCCGC CTCCCGGGTT CATGCCATTC TCCTGCCTCA 4860 

GCCTCCTGAG TAGCTGGGAC TACAGGCGCC CACCACCACG CCCGGCTAAT TTTTTGTATT 4920 

TTTAATAGAG ACGGGGTTTC ACTGTGTTAG CCAGGATGGT CTCGATCTCC TGACCTCGTG 4980 

ATCCGCCTGC CTCGGCCTCC CAAAGTGCTG GGATTACAGG CATGACCCAC CGCTCCCGGC 5040 

CTTGTTTTCC GTTTAAAGTC GTCTTCTTTT AATGTAATCA TTTTGAACAT GTGTGAAAGT 5100 

TGATCATACG AATTGGATCA ATCTTGAAAT ACTCAACCAA AAGACAGTCG AGAAGCCAGG 5160

GGGAGAAAGA ACTCAGGGCA CAAAATATTG GTCTGAGAAT GGAATTCTCT GTAAGCCTAG 5220

TTGCTGAAAT TTCCTGCTGT AACCAGAAGC CAGTTTTATC TAACGGCTAC TGAAACACCC 5280

ACTGTGTTTT GCTCACTCCC TCACTCACCG ATCAAAACCT GCTACCTCCC CAAGACTTTA 5340 

CTAGTGCCGA TAAACTTTCT CAAAGAGCAA CCAGTATCAC TTCCCTGTTT ATAAAACCTC 5400

TAACCATCTC TTTGTTCTTT GAACATGCTG AAAACCACCT GGTCTGCATG TATGCCCGAA 5460 

TTTGTAATTC TTTTCTCTCA AATGAAAATT TAATTTTAGG GATTCATTTC TATATTTTCA 5520 

CATATGTAGT ATTATTATTT CCTTATATGT GTAAGGTGAA ATTTATGGTA TTTGAGTGTG 5580 

CAAGAAAATA TATTTTTAAA GCTTTCATTT TTCCCCCAGT GAATGATTTA GAATTTTTTA 5640 

TGTAAATATA CAGAATGTTT TTTCTTACTT TTATAAGGAA GCAGCTGTCT AAAATGCAGT 5700 

GGGGTTTGTT TTGCAATGTT TTAAACAGAG TTTTAGTATT GCTATTAAAA GAAGTTACTT 5760 

TGCTTTTAAA GAAACTTGGC TGCTTAAAAT AAGCAAAAAT TGGATGCATA AAGTAATATT 5820

TACAGATGTG GGGAGATGTA ATAAAACAAT ATTAACTTGG TTTCTTGTTT TTGCTGTATT 5880 

TAGAGATTAA ATAATTCTAA GATGATCACT TTGCAAAATT ATGCTTATGG CTGGCATGGA 5940 

AATAGAAATA CTCAATTATG TCTTTGTTGT ATTAATGGGG AATATTTTGG ACAATGTTTC 6000

ATTATCAAAT TGTCGACATC ATTAATATAT ATTGTAATGT TGGGAAGAGA TCACTATTTT 6060

GAAGCACAGC TTTACAGATG AGTATCTATG ATACATATGT ATAATAAATT TTGATCGGGT 6120

ATTAAAAGTA TTAGAAGGTG GTTATAATTG CAGAGTATTC CATGAATAGT ACACTGACAC 6180 

AGGGGTTTTA CTTTGAGGAC CAGTGTAGTC AAGGGAAAAC ATGAGTTAAA AAGAAAAGCA 6240 

GGCAATATTG CAGTCTTGAT TCTGCCACTT ACAGGATAGA TAATGCCTGA ACTTTAATGA 6300 

CAAGATGATC CAACCATAAA GGTGCTCTGT GCTTCACAGT GAATCTTTTC CCCATGCAGG 6360 

AGTGTGCTCC CCTACAAACG TTAAGACTGA TCATTTCAAA AATCTATTAG CTATATCAAA 6420

AGCCTTACAT TTTAATATAG GTTGAACCAA AATTTCAATT CCAGTAACTT CTATTGTAAC 6480 

CATTATTTTT GTGTATGTCT TCAAGAATGT TCATTGGATT TTTGTTTGTA ATAGTAAAAT 6540 

ACCGGATACA TTTCACGTGT CCTTCAGTAT TGATTTGGTT GAATATTGGG TCATAATGGT 6600 

TGAGAAGCAT GGACACTAGA GCCAGAATGC TTGGATATGA ATCCTGGATC TGTCACTTAC 6660 

TTCTGTGTGA CCTTTGAAAG GCTACTTATT TCCTCTCTTA GCTTTCTCAT TAAAATCAAT 6720 

GAACAATGCC AGCCTCATGG GGTTGTTGAA TGATTAAATT AGTTAATATA CCTAAAGTAC 6780 
ATAGAACACT GCCTGCACAT AGTAAAAGAA TTATAAGTGT GAGGTAGTTG GTAAAATTAT 6840

GTAGTTGGAT ATACTACCGA ACAATATCTA ATCTCTTTTT AGGGAAATAA AGTTTGTGCA 6900
TATATATAAT CCCGAAACAT G 

Seq ID NO: 32 Protein sequence: 
Protein Accession #: NP_001932.1 

1 11 21 31 41 51 

| | | | | |

MAAAGPRRSV RGAVCLHLLL TLVIFSRDGE ACKKVILNVP SKLEADKIIG RVNLEECFRS 60

ADLIRSSDPD FRVLNDGSVY TARAVALSDK KRSFTIWLSD KRKQTQKEVT VLLEHQKKVS 120

KTRHTRETVL RRAKRRWAPI PCSMQENSLG PFPLFLQQVE SDAAQNYTVF YSISGRGVDK 180

EPLNLFYIER DTGNLFCTRP VDREEYDVFD LIAYASTADG YSADLPLPLP IRVEDENDNH 240

PVFTEAIYNF EVLESSRPGT TVGVVCATDR DEPDTMHTRL KYSILQQTPR SPGLFSVHPS 300

TGVITTVSHY LDREVVDKYS LIMKVQDMDG QFFGLIGTST CIITVTDSND NAPTFRQNAY 360

EAFVEENAFN VEILRIPIED KDLINTANWR VNFTILKGNE NGHFKISTDK ETNEGVLSVV 420

KPLNYEENRQ VNLEIGVNNE APFARDIPRV TALNRALVTV HVRDLDEGPE CTPAAQYVRI 480 

KENLAVGSKI NGYKAYDPEN RNGNGDRYKK LHDPKGWITI DEISGSIITS KILDREVETP 540 

KNELYNITVL AIDKDDRSCT GTLAVNIEDV NDNPPEILQE YVVICKPKMG YTDILAVDPD 600

EPVHGAPFYF SLPNTSPEIS RLWSLTKVND TAARLSYQKN AGFQEYTIPI TVKDRAGQAA 660 

TKLLRVNLCE CTHPTQCRAT SRSTGVILGK WAILAILLGI ALLFSVLLTL VCGVFGATKG 720 

KRFPEDLAQQ NLIISNTEAP GDDRVCSANG FMTQTTNNSS QGFCGTMGSG MKNGGQETIE 780 

MMKGGNQTLE SCRGAGHHHT LDSCRGGHTE VDNCRYTYSE WHSFTQPRLG EKLHRCNQNE 840
DRMPSQDYVL TYNYEGRGSP AGSVGCCSEK QEEDGLDFLN NLEPKFITLA EACTKR 



Seq ID NO: 33 DNA sequence 

GGCAGGTCTC GCTCTCGGCA CCCTCCCGGC GCCCGCGTTC TCCTGGCCCT GCCCGGCATC 60

CCGATGGCCG CCGCTGGGCC CCGGCGCTCC GTGCGCGGAG CCGTCTGCCT GCATCTGCTG 120 

CTGACCCTCG TGATCTTCAG TCGTGATGGT GAAGCCTGCA AAAAGGTGAT ACTTAATGTA 180 

CCTTCTAAAC TAGAGGCAGA CAAAATAATT GGCAGAGTTA ATTTGGAAGA GTGCTTCAGG 240 

TCTGCAGACC TCATCCGGTC AAGTGATCCT GATTTCAGAG TTCTAAATGA TGGGTCAGTG 300

TACACAGCCA GGGCTGTTGC GCTGTCTGAT AAGAAAAGAT CATTTACCAT ATGGCTTTCT 360 

GACAAAAGGA AACAGACACA GAAAGAGGTT ACTGTGCTGC TAGAACATCA GAAGAAGGTA 420

TCGAAGACAA GACACACTAG AGAAACTGTT CTCAGGCGTG CCAAGAGGAG ATGGGCACCT 480

ATTCCTTGCT CTATGCAAGA GAATTCCTTG GGCCCTTTCC CATTGTTTCT TCAACAAGTT 540

GAATCTGATG CAGCACAGAA CTATACTGTC TTCTACTCAA TAAGTGGACG TGGAGTTGAT 600 

AAAGAACCTT TAAATTTGTT TTATATAGAA AGAGACAGTG GAAATCTATT TTGCACTCGG 660 

CCTGTGGATC GTGAAGAATA TGATGTTTTT GATTTGATTG CTTATGCGTC AACTGCAGAT 720

GGATATTCAG CAGATCTGCC CCTCCCACTA CCCATCAGGG TAGAGGATGA AAATGACAAC 780 

CACCCTGTTT TCACAGAAGC AATTTATAAT TTTGAAGTTT TGGAAAGTAG TAGACCTGGT 840 

ACTACAGTGG GGGTGGTTTG TGCCACAGAC AGAGATGAAC CGGACACAAT GCATACGCGC 900 

CTGAAATACA GCATTTTGCA GCAGACACCA AGGTCACCTG GGCTCTTTTC TGTGCATCCC 960 

AGCACAGGCG TAATCACCAC AGTCTCTCAT TATTTGGACA GAGAGGTTGT AGACAAGTAC 1020

TCATTGATAA TGAAAGTACA AGACATGGAT GGCCAGTTTT TTGGATTGAT AGGCACATCA 1080

ACTTGTATCA TAACAGTAAC AGATTCAAAT GATAATGCAC CCACTTTCAG ACAAAATGCT 1140 

TATGAAGCAT TTGTAGAGGA AAATGCATTC AATGTGGAAA TCTTACGAAT ACCTATAGAA 1200

GATAAGGATT TAATTAACAC TGCCAATTGG AGAGTCAATT TTACCATTTT AAAGGGAAAT 1260

GAAAATGGAC ATTTCAAAAT CAGCACAGAC AAAGAAACTA ATGAAGGTGT TCTTTCTGTT 1320

GTAAAGCCAC TGAATTATGA AGAAAACCGT CAAGTGAACC TGGAAATTGG AGTAAACAAT 1380 

GAAGCGCCAT TTGCTAGAGA TATTCCCAGA GTGACAGCCT TGAACAGAGC CTTGGTTACA 1440 

GTTCATGTGA GGGATCTGGA TGAGGGGCCT GAATGCACTC CTGCAGCCCA ATATGTGCGG 1500 

ATTAAAGAAA ACTTAGCAGT GGGGTCAAAG ATCAACGGCT ATAAGGCATA TGACCCCGAA 1560

AATAGAAATG GCAATGGTTT AAGGTACAAA AAATTGCATG ATCCTAAAGG TTGGATCACC 1620

ATTGATGAAA TTTCAGGGTC AATCATAACT TCCAAAATCC TGGATAGGGA GGTTGAAACT 1680 

CCCAAAAATG AGTTGTATAA TATTACAGTC CTGGCAATAG ACAAAGATGA TAGATCATGT 1740 

ACTGGAACAC TTGCTGTGAA CATTGAAGAT GTAAATGATA ATCCACCAGA AATACTTCAA 1800 

GAATATGTAG TCATTTGCAA ACCAAAAATG GGGTATACCG ACATTTTAGC TGTTGATCCT 1860 

GATGAACCTG TCCATGGAGC TCCATTTTAT TTCAGTTTGC CCAATACTTC TCCAGAAATC 1920

AGTAGACTGT GGAGCCTCAC CAAAGTTAAT GATACAGCTG CCCGTCTTTC ATATCAGAAA 1980 

AATGCTGGAT TTCAAGAATA TACCATTCCT ATTACTGTAA AAGACAGGGC CGGCCAAGCT 2040

GCAACAAAAT TATTGAGAGT TAATCTGTGT GAATGTACTC ATCCAACTCA GTGTCGTGCG 2100

ACTTCAAGGA GTACAGGAGT AATACTTGGA AAATGGGCAA TCCTTGCAAT ATTACTGGGT 2160 

ATAGCACTGC TCTTTTCTGT ATTGCTAACT TTAGTATGTG GAGTTTTTGG TGCAACTAAA 2220 

GGGAAACGTT TTCCTGAAGA TTTAGCACAG CAAAACTTAA TTATATCAAA CACAGAAGCA 2280 

CCTGGAGACG ATAGAGTGTG CTCTGCCAAT GGATTTATGA CCCAAACTAC CAACAACTCT 2340 

AGCCAAGGTT TTTGTGGTAC TATGGGATCA GGAATGAAAA ATGGAGGGCA GGAAACCATT 2400 

GAAATGATGA AAGGAGGAAA CCAGACCTTG GAATCCTGCC GGGGGGCTGG GCATCATCAT 2460

ACCCTGGACT CCTGCAGGGG AGGACACACG GAGGTGGACA ACTGCAGATA CACTTACTCG 2520

GAGTGGCACA GTTTTACTCA ACCCCGTCTC GGTGAAGAAT CCATTAGAGG ACACACTGGT 2580 

TAAAAATTAA ACATAAAAGA AATTGCATCG ATGTAATCAG AATGAAGACC GCATGCCATC 2640 

CCAAGATTAT GTCCTCACTT ATAACTATGA GGGAAGAGGA TCTCCAGCTG GTTCTGTGGG 2700 

CTGCTGCAGT GAAAAGCAGG AAGAAGATGG CCTTGACTTT TTAAATAATT TGGAACCCAA 2760 

ATTTATTACA TTAGCAGAAG CATGCACAAA GAGATAATGT CACAGTGCTA CAATTAGGTC 2820

TTTGTCAGAC ATTCTGGAGG TTTCCAAAAA TAATATTGTA AAGTTCAATT TCAACATGTA 2880

TGTATATGAT GATTTTTTTC TCAATTTTGA ATTATGCTAC TCACCAATTT ATATTTTTAA 2940

AGCCAGTTGT TGCTTATCTT TTCCAAAAAG TGAAAAATGT TAAAACAGAC AACTGGTAAA 3000

TCTCAAACTC CAGCACTGGA ATTAAGGTCT CTAAAGCATC TGCTCTTTTT TTTTTTTACG 3060

GATATTTTAG TAATAAATAT GCTGGATAAA TATTAGTCCA ACAATAGCTA AGTTATGCTA 3120

ATATCACATT ATTATGTATT CACTTTAAGT GATAGTTTAA AAAATAAACA AGAAATATTG 3180

AGTATCACTA TGTGAAGAAA GTTTTGGAAA AGAAACAATG AAGACTGAAT TAAATTAAAA 3240 

ATGTTGCAGC TCATAAAGAA TTGGGACTCA CCCCTACTGC ACTACCAAAT TCATTTGACT 3300 

TTGGAGGCAA AATGTGTTGA AGTGCCCTAT GAAGTAGCAA TTTTCTATAG GAATATAGTT 3360

GGAAATAAAT GTGTGTGTGT ATATTATTAT TAATCAATGC AATATTTAAA ATGAAATGAG 3420 

AACAAAGAGG AAAATGGTAA AAACTTGAAA TGAGGCTGGG GTATAGTTTG TCCTACAATA 3480

GAAAAAAGAG AGAGCTTCCT AGGCCTGGGC TCTTAAATGC TGCATTATAA CTGAGTCTAT 3540 

GAGGAAATAG TTCCTGTCCA ATTTGTGTAA TTTGTTTAAA ATTGTAAATA AATTAAACTT 3600

TTCTGGTTTC TGTGGGAAGG AAATAGGGAA TCCAATGGAA CAGTAGCTTT GCTTTGCAGT 3660 

CTGTTTCAAG ATTTCTGCAT CCACAAGTTA GTAGCAAACT GGGGAATACT CGCTGCAGCT 3720 

GGGGTTCCCT GCTTTTTGGT AGCAAGGGTC CAGAGATGAG GTGTTTTTTT CGGGGAGCTA 3780 

ATAACAAAAA CATTTTAAAA CTTACCTTTA CTGAAGTTAA ATCCTCTATT GCTGTTTCTA 3840

TTCTCTCTTA TAGTGACCAA CATCTTTTTA ATTTAGATCC AAATAACCAT GTCCTCCTAG 3900

AGTTTAGAGG CTAGAGGGAG CTGAGGGGAG GATCTTACTG AAAGCACCCT GGGGAGATTG 3960

ATTGTCCTTA AACCTAAGCC CCACAAACTT GACACCTGAT CAGGTCTGGG AGCTACAAAA 4020 

TTTCATTTTT CTCCTCACTG CCCTTCTTCT GAGTGGCATT GGCCTGAATC AAGGAAAGCC 4080 

AGGCCTTGTG GGCCCCCTTC TTTCGGCTTT CTGCTAAAGC AACACCTCCA GCAGAGATTC 4140

CCTTAAGTGA CTCCAGGTTT TCCACCATCC TTCAGCGTGA ATTAATTTTT AATCAGTTTG 4200

CTTTCTCCAG AGAAATTTTA AAATAATAGA AGAAATAGAA ATTTTGAATG TATAAAAGAA 4260 

AAAGATCAAG TTGTCATTTT AGAACAGAGG GAACTTTGGG AGAAAGCAGC CCAAGTAGGT 4320

TATTTGTACA GTCAGAGGGC AACAGGAAGA TGCAGGCCTT CAAGGGCAAG GAGAGGCCAC 4380 

AAGGAATATG GGTGGGAGTA AAAGCAACAT CGTCTGCTTC ATACTTTTTC CTAGGCTTGG 4440 

CACTGCCTTT TCCTTTCTCA GGCCAATGGC AACTGCCATT TGAGTCCGGT GAGGGATCAG 4500

CCAACCTCTT CTCTATGGCT CACCTTATTT GGAGTGAGAA ATCAAGGAGA CAGAGCTGAC 4560 

TGCATGATGA GTCTGAAGGC ATTTGCAGGA TGAGCCTGAA CTGGTTGTGC AGAACAAACA 4620

AGGCATTCAT GGGAATTGTT GTATTCCTTC TGCAGCCCTC CTTCTGGGCA CTAAGAAGGT 4680 

CTATGAATTA AATGCCTATC TAAAATTCTG ATTTATTCCT ACATTTTCTG TTTTCTAATT 4740 

TGACCCTAAA ATCTATGTGT TTTAGACTTA GACTTTTTAT TGCCCCCCCC CCCTTTTTTT 4800

TTGAGACGGA GTCTCGCTCT GACGCACAGG CTGGAGTGCA GTGGCTCCGA TCTCTGCTCA 4860 

CTGAAAGCTC CGCCTCCCGG GTTCATGCCA TTCTCCTGCC TCAGCCTCCT GAGTAGCTGG 4920 

GACTACAGGC GCCCACCACC ACGCCCGGCT AATTTTTTGT ATTTTTAATA GAGACGGGGT 4980 

TTCACTGTGT TAGCCAGGAT GGTCTCGATC TCCTGACCTC GTGATCCGCC TGCCTCGGCC 5040

TCCCAAAGTG CTGGGATTAC AGGCATGACC CACCGCTCCC GGCCTTGTTT TCCGTTTAAA 5100

GTCGTCTTCT TTTAATGTAA TCATTTTGAA CATGTGTGAA AGTTGATCAT ACGAATTGGA 5160

TCAATCTTGA AATACTCAAC CAAAAGACAG TCGAGAAGCC AGGGGGAGAA AGAACTCAGG 5220 

GCACAAAATA TTGGTCTGAG AATGGAATTC TCTGTAAGCC TAGTTGCTGA AATTTCCTGC 5280

TGTAACCAGA AGCCAGTTTT ATCTAACGGC TACTGAAACA CCCACTGTGT TTTGCTCACT 5340 

CCCACTCACC GATCAAAACC TGCTACCTCC CCAAGACTTT ACTAGTGCCG ATAAACTTTC 5400

TCAAAGAGCA ACCAGTATCA CTTCCCTGTT TATAAAACCT CTAACCATCT CTTTGTTCTT 5460

TGAACATGCT GAAAACCACC TGGTCTGCAT GTATGCCCGA ATTTGTAATT CTTTTCTCTC 5520

AAATGAAAAT TTAATTTTAG GGATTCATTT CTATATTTTC ACATATGTAG TATTATTATT 5580 

TCCTTATATG TGTAAGGTGA AATTTATGGT ATTTGAGTGT GCAAGAAAAT ATATTTTTAA 5640 

AGCTTTCATT TTTCCCCCAG TGAATGATTT AGAATTTTTT ATGTAAATAT ACAGAATGTT 5700

TTTTCTTACT TTTATAAGGA AGCAGCTGTC TAAAATGCAG TGGGGTTTGT TTTGCAATGT 5760 

TTTAAACAGA GTTTTAGTAT TGCTATTAAA AGAAGTTACT TTGCTTTTAA AGAAACTTGG 5820 

CTGCTTAAAA TAAGCAAAAA TTGGATGCAT AAAGTAATAT TTACAGATGT GGGGAGATGT 5880 

AATAAAACAA TATTAACTTG GCTGCTTAAA ATAAGCAAAA ATTGGATGCA TAAAGTAATA 5940 

TTTACAGATG TGGGGAGATG TAATAAAACA ATATTAACTT GGTTTCTTGT TTTTGCTGTA 6000

TTTAGAGATT AAATAATTCT AAGATGATCA CTTTGCAAAA TTATGCTTAT GGCTGGCATG 6060 

GAAATAGAAA TACTCAATTA TGTCTTTGTT GTATTAATGG GGAATATTTT GGACAATGTT 6120

TCATTATCAA ATTGTCGACA TCATTAATAT ATATTGTAAT GTTGGGAAGA GATCACTATT 6180 

TTGAAGCACA GCTTTACAGA TGAGTATCTA TGATACATAT GTATAATAAA TTTTGATCGG 6240 

GTATTAAAAG TATTAGAAGG TGGTTATAAT TGCAGAGTAT TCCATGAATA GTACACTGAC 6300

ACAGGGGTTT TACTTTGAGG ACCAGTGTAG TCAAGGGAAA ACATGAGTTA AAAAGAAAAG 6360

CAGGCAATAT TGCAGTCTTG ATTCTGCCAC TTACAGGATA GATAATGCCT GAACTTTAAT 6420 

GACAAGATGA TCCAACCATA AAGGTGCTCT GTGCTTCACA GTGAATCTTT TCCCCATGCA 6480 

GGAGTGTGCT CCCCTACAAA CGTTAAGACT GATCATTTCA AAAATCTATT AGCTATATCA 6540

AAAGCCTTAC ATTTTAATAT AGGTTGAACC AAAATTTCAA TTCCAGTAAC TTCTATTGTA 6600

ACCATTATTT TTGTGTATGT CTTCAAGAAT GTTCATTGGA TTTTTGTTTG TAATAGTAAA 6660

ATACCGGATA CATTTCACGT GTCCTTCAGT ATTGATTTGG TTGAATATTG GGTCATAATG 6720 

GTTGAGAAGC ATGGACACTA GAGCCAGAAT GCTTGGATAT GAATCCTGGA TCTGTCACTT 6780 

ACTTCTGTGT GACCTTTGAA AGGCTACTTA TTTCCTCTCT TAGCTTTCTC ATTAAAATCA 6840

ATGAACAATG CCAGCCTCAT GGGGTTGTTG AATGATTAAA TTAGTTAATA TACCTAAAGT 6900

ACATAGAACA CTGCCTGCAC ATAGTAAAAG AATTATAAGT GTGAGGTAGT TGGTAAAATT 6960 

ATGTAGTTGG ATATACTACC GAACAATATC TAATCTCTTT TTAGGGAAAT AAAGTTTGTG 7020

CATATATATA ATCCCGAAAC ATG

Seq ID NO: 34 Protein sequence:

MAAAGPRRSV RGAVCLHLLL TLVIFSRDGE ACKKVILNVP SKLEADKIIG RVNLEECFRS 60

ADLIRSSDPD FRVLNDGSVY TARAVALSDK KRSFTIWLSD KRKQTQKEVT VLLEHQKKVS 120

KTRHTRETVL RRAKRRWAPI PCSMQENSLG PFPLFLQQVE SDAAQNYTVF YSISGRGVDK 180

EPLNLFYIER DTGNLFCTRP VDREEYDVFD LIAYASTADG YSADLPLPLP IRVEDENDNH 240

PVFTEAIYNF EVLESSRPGT TVGVVCATDR DEPDTMHTRL KYSILQQTPR SPGLFSVHPS 300

TGVITTVSHY LDREVVDKYS LIMKVQDMDG QFFGLIGTST CIITVTDSND NAPTFRQNAY 360

EAFVEENAFN VEILRIPIED KDLINTANWR VNFTILKGNE NGHFKISTDK ETNEGVLSVV 420

KPLNYEENRQ VNLEIGVNNE APFARDIPRV TALNRALVTV HVRDLDEGPE CTPAAQYVRI 480

KENLAVGSKI NGYKAYDPEN RNGNGLRYKK LHDPKGWITI DEISGSIITS KILDREVETP 540

KNELYNITVL AIDKDDRSCT GTLAVNIEDV NDNPPEILQE YVVICKPKMG YTDILAVDPD 600

EPVHGAPFYF SLPNTSPEIS RLWSLTKVND TAARLSYQKN AGFQEYTIPI TVKDRAGQAA 660

TKLLRVNLCE CTHPTQCRAT SRSTGVILGK WAILAILLGI ALLFSVLLTL VCGVFGATKG 720 

KRFPEDLAQQ NLIISNTEAP GDDRVCSANG FMTQTTNNSS QGFCGTMGSG MKNGGQETIE 780 

MMKGGNQTLE SCRGAGHHHT LDSCRGGHTE VDNCRYTYSE WHSFTQPRLG EESIRGHTG 

GGGAGTGGGC GTGGCGGTGC TGCCCAGGTG AGCCACCGCT GCTTCTGCCC AGACACGGTC 60

GCCTCCACAT CCAGGTCTTT GTGCTCCTCG CTTGCCTGTT CCTTTTCCAC GCATTTTCCA 120

GGATAACTGT GACTCCAGGC CCGCAATGGA TGCCCTGCAA CTAGCAAATT CGGCTTTTGC 180

CGTTGATCTG TTCAAACAAC TATGTGAAAA GGAGCCACTG GGCAATGTCC TCTTCTCTCC 240

AATCTGTCTC TCCACCTCTC TGTCACTTGC TCAAGTGGGT GCTAAAGGTG ACACTGCAAA 300

TGAAATTGGA CAGGTTCTTC ATTTTGAAAA TGTCAAAGAT ATACCCTTTG GATTTCAAAC 360

AGTAACATCG GATGTAAACA AACTTAGTTC CTTTTACTCA CTGAAACTAA TCAAGCGGCT 420

CTACGTAGAC AAATCTCTGA ATCTTTCTAC AGAGTTCATC AGCTCTACGA AGAGACCCTA 480

TGCAAAGGAA TTGGAAACTG TTGACTTCAA AGATAAATTG GAAGAAACGA AAGGTCAGAT 540 

CAACAACTCA ATTAAGGATC TCACAGATGG CCACTTTGAG AACATTTTAG CTGACAACAG 600 

TGTGAACGAC CAGACCAAAA TCCTTGTGGT TAATGCTGCC TACTTTGTTG GCAAGTGGAT 660 

GAAGAAATTT CCTGAATGAG AAACAAAAGA ATGTCCTTTC AGACTCAACA AGACAGACAC 720

CAAACCAGTG CAGATGATGA ACATGGAGGC CACGTTCTGT ATGGGAAACA TTGACAGTAT 780

CAATTGTAAG ATCATAGAGC TTCCTTTTCA AAATAAGCAT CTCAGCATGT TCATCCTACT 840

ACCCAAGGAT GTGGAGGATG AGTCCACAGG CTTGGAGAAG ATTGAAAAAC AACTCAACTC 900 

AGAGTCACTG TCACAGTGGA CTAATCCCAG CACCATGGCC AATGCCAAGG TCAAACTCTC 960 

CATTCCAAAA TTTAAGGTGG AAAAGATGAT TGATCCCAAG GCTTGTCTGG AAAATCTAGG 1020

GCTGAAACAT ATCTTCAGTG AAGACACATC TGATTTCTCT GGAATGTCAG AGACCAAGGG 1080 

AGTGGCCCTA TCAAATGTTA TCCACAAAGT GTGCTTAGAA ATAACTGAAG ATGGTGGGGA 1140 

TTCCATAGAG GTGCCAGGAG CACGGATCCT GCAGCACAAG GATGAATTGA ATGCTGACCA 1200 

TCCCTTTATT TACATCATCA GGCACAACAA AACTCGAAAC ATCATTTTCT TTGGCAAATT 1260

CTGTTCTCCT TAAGTGGCAT AGCCCATGTT AAGTCCTCCC TGACTTTTCT GTGGATGCCG 1320

ATTTCTGTAA ACTCTGCATC CAGAGATTCA TTTTCTAGAT ACAATAAATT GCTAATGTTG 1380

CTGGATCAGG AAGCCGCCAG TACTTGTCAT ATGTAGCCTT CACACAGATA GACCTTTTTT 1440

TTTTTCCAAT TCTATCTTTT GTTTCCTTTT TTCCCATAAG ACAATGACAT ACGCTTTTAA 1500

TGAAAAGGAA TCACGTTAGA GGAAAAATAT TTATTCATTA TTTGTCAAAT TGTCCGGGGT 1560

AGTTGGCAGA AATACAGTCT TCCACAAAGA AAATTCCTAT AAGGAAGATT TGGAAGCTCT 1620

TCTTCCCAGC ACTATGCTTT CCTTCTTTGG GATAGAGAAT GTTCCAGACA TTCTCGCTTC 1680 

CCTGAAAGAC TGAAGAAAGT GTAGTGCATG GGACCCACGA AACTGCCCTG GCTCCAGTGA 1740 

AACTTGGGCA CATGCTCAGG CTACTATAGG TCCAGAAGTC CTTATGTTAA GCCCTGGCAG 1800

GCAGGTGTTT ATTAAAATTC TGAATTTTGG GGATTTTCAA AAGATAATAT TTTACATACA 1860

CTGTATGTTA TAGAACTTCA TGGATCAGAT CTGGGGCAGC AACCTATAAA TCAACACCTT 1920

AATATGCTGC AACAAAATGT AGAATATTCA GACAAAATGG ATACATAAAG ACTAAGTAGC 1980 

CCATAAGGGG TCAAAATTTG CTGCCAAATG CGTATGCCAC CAACTTACAA AAACACTTCG 2040 

TTCGCAGAGC TTTTCAGATT GTGGAATGTT GGATAAGGAA TTATAGACCT CTAGTAGCTG 2100 

AAATGCAAGA CCCCAAGAGG AAGTTCAGAT CTTAATATAA ATTCACTTTC ATTTTTGATA 2160

GCTGTCCCAT CTGGTCATGT GGTTGGCACT AGACTGGTGG CAGGGGCTTC TAGCTGACTC 2220

GCACAGGGAT TCTCACAATA GCCGATATCA GAATTTGTGT TGAAGGAACT TGTCTCTTCA 2280 

TCTAATATGA TAGCGGGAAA AGGAGAGGAA ACTACTGCCT TTAGAAAATA TAAGTAAAGT 2340 

GATTAAAGTG CTCACGTTAC CTTGACACAT AGTTTTTCAG TCTATGGGTT TAGTTACTTT 2400 

AGATGGCAAG CATGTAACTT ATATTAATAG TAATTTGTAA AGTTGGGTGG ATAAGCTATC 2460

CCTGTTGCCG GTTCATGGAT TACTTCTCTA TAAAAAATAT ATATTTACCA AAAAATTTTG 2520

TGACATTCCT TCTCCCATCT CTTCCTTGAC ATGCATTGTA AATAGGTTCT TCTTGTTCTG 2580 
AGATTCAATA TTGAATTTCT CCTATGCTAT TGACAATAAA ATATTATTGA ACTACC 

MDALQLANSA FAVDLFKQLC EKEPLGNVLF SPICLSTSLS LAQVGAKGDT ANEIGQVLHF 60

ENVKDIPFGF QTVTSDVNKL SSFYSLKLIK RLYVDKSLNL STEFISSTKR PYAKELETVD 120

FKDKLEETKG QINNSIKDLT DGHFENILAD NSVNDQTKIL VVNAAYFVGK WMKKFPESET 180

KECPFRLNKT DTKPVQMMNM EATFCMGNID SINCKIIELP FQNKHLSMFI LLPKDVEDES 240 

TGLEKIEKQL NSESLSQWTN PSTMANAKVK LSIPKFKVEK MIDPKACLEN LGLKHIFSED 300 

TSDFSGMSET KGVALSNVIH KVCLEITEDG GDSIEVPGAR ILQHKDELNA DHPFIYIIRH 360 
NKTRNIIFFG KFCSP

GGAGTGGGGG AGAGAGAGGA GACCAGGACA GCTGCTGAGA CCTCTAAGAA GTCCAGATAC 60

TAAGAGCAAA GATGTTTCAA ACTGGGGGCC TCATTGTCTT CTACGGGCTG TTAGCCCAGA 120

CCATGGCCCA GTTTGGAGGC CTGCCCGTGC CCCTGGACCA GACCCTGCCC TTGAATGTGA 180 

ATCCAGCCCT GCCCTTGAGT CCCACAGGTC TTGCAGGAAG CTTGACAAAT GCCCTCAGCA 240 

ATGGCCTGCT GTCTGGGGGC CTGTTGGGCA TTCTGGAAAA CCTTCCGCTC CTGGACATCC 300 

TGAAGCCTGG AGGAGGTACT TCTGGTGGCC TCCTTGGGGG ACTGCTTGGA AAAGTGACGT 360 

CAGTGATTCC TGGCCTGAAC AACATCATTG ACATAAAGGT CACTGACCCC CAGCTGCTGG 420

AACTTGGCCT TGTGCAGAGC CCTGATGGCC ACCGTCTCTA TGTCACCATC CCTCTCGGCA 480 

TAAAGCTCCA AGTGAATACG CCCCTGGTCG GTGCAAGTCT GTTGAGGCTG GCTGTGAAGC 540 

TGGACATCAC TGCAGAAATC TTAGCTGTGA GAGATAAGCA GGAGAGGATC CACCTGGTCC 600 

TTGGTGACTG CACCCATTCC CCTGGAAGCC TGCAAATTTC TCTGCTTGAT GGACTTGGCC 660 

CCCTCCCCAT TCAAGGTCTT CTGGACAGCC TCACAGGGAT CTTGAATAAA GTCCTGCCTG 720

AGTTGGTTCA GGGCAACGTG TGCCCTCTGG TCAATGAGGT TCTCAGAGGC TTGGACATCA 780

CCCTGGTGCA TGACATTGTT AACATGCTGA TCCACGGACT ACAGTTTGTC ATCAAGGTCT 840

AAGCCTTCCA GGAAGGGGCT GGCCTCTGCT GAGCTGCTTC CCAGTGCTCA CAGATGGCTG 900 

GCCCATGTGC TGGAAGATGA CACAGTTGCC TTCTCTCCGA GGAACCTGCC CCCTCTCCTT 960 

TCCCACCAGG CGTGTGTAAC ATCCCATGTG CCTCACCTAA TAAAATGGCT CTTCTTCTGC 1020
AAAAAAAAAA AAAAAAAAAA AAAAAAAAA 

MFQTGGLIVF YGLLAQTMAQ FGGLPVPLDQ TLPLNVNPAL PLSPTGLAGS LTNALSNGLL 60
SGGLLGILEN LPLLDILKPG GGTSGGLLGG LLGKVTSVIP GLNNIIDIKV TDPQLLELGL 120
VQSPDGHRLY VTIPLGIKLQ VNTPLVGASL LRLAVKLDIT AEILAVRDKQ ERIHLVLGDC 180 
THSPGSLQIS LLDGLGPLPI QGLLDSLTGI LNKVLPELVQ GNVCPLVNEV LRGLDITLVH 240 
DIVNMLIHGL QFVIKV 

CTCAGGGCAG AGGGAGGAAG GACAGCAGAC CAGACAGTCA CAGCAGCCTT GACAAAACGT 60

TCCTGGAACT CAAGCTCTTC TCCACAGAGG AGGACAGAGC AGACAGCAGA GACCATGGAG 120 

TCTCCCTCGG CCCCTCCCCA CAGATGGTGC ATCCCCTGGC AGAGGCTCCT GCTCACAGCC 180 

TCACTTCTAA CCTTCTGGAA CCCGCCCACC ACTGCCAAGC TCACTATTGA ATCCACGCCG 240 

TTCAATGTCG CAGAGGGGAA GGAGGTGCTT CTACTTGTCC ACAATCTGCC CCAGCATCTT 300

TTTGGCTACA GCTGGTACAA AGGTGAAAGA GTGGATGGCA ACCGTCAAAT TATAGGATAT 360

GTAATAGGAA CTCAACAAGC TACCCCAGGG CCCGCATACA GTGGTCGAGA GATAATATAC 420

CCCAATGCAT CCCTGCTGAT CCAGAACATC ATCCAGAATG ACACAGGATT CTACACCCTA 480

CACGTCATAA AGTCAGATCT TGTGAATGAA GAAGCAACTG GCCAGTTCCG GGTATACCCG 540 

GAGCTGCCCA AGCCCTCCAT CTCCAGCAAC AACTCCAAAC CCGTGGAGGA CAAGGATGCT 600 

GTGGCCTTCA CCTGTGAACC TGAGACTCAG GACGCAACCT ACCTGTGGTG GGTAAACAAT 660 

CAGAGCCTCC CGGTCAGTCC CAGGCTGCAG CTGTCCAATG GCAACAGGAC CCTCACTCTA 720

TTCAATGTCA CAAGAAATGA CACAGCAAGC TACAAATGTG AAACCCAGAA CCCAGTGAGT 780 

GCCAGGCGCA GTGATTCAGT CATCCTGAAT GTCCTCTATG GCCCGGATGC CCCCACCATT 840 

TCCCCTCTAA ACACATCTTA CAGATCAGGG GAAAATCTGA ACCTCTCCTG CCACGCAGCC 900

TCTAACCCAC CTGCACAGTA CTCTTGGTTT GTCAATGGGA CTTTCCAGCA ATCCACCCAA 960 

GAGCTCTTTA TCCCCAACAT CACTGTGAAT AATAGTGGAT CCTATACGTG CCAAGCCCAT 1020

AACTCAGACA CTGGCCTCAA TAGGACCACA GTCACGACGA TCACAGTCTA TGCAGAGCCA 1080 

CCCAAACCCT TCATCACCAG CAACAACTCC AACCCCGTGG AGGATGAGGA TGCTGTAGCC 1140 

TTAACCTGTG AACCTGAGAT TCAGAACACA ACCTACCTGT GGTGGGTAAA TAATCAGAGC 1200

CTCCCGGTCA GTCCCAGGCT GCAGCTGTCC AATGACAACA GGACCCTCAC TCTACTCAGT 1260

GTCACAAGGA ATGATGTAGG ACCCTATGAG TGTGGAATCC AGAACGAATT AAGTGTTGAC 1320 

CACAGCGACC CAGTCATCCT GAATGTCCTC TATGGCCCAG ACGACCCCAC CATTTCCCCC 1380 

TCATACACCT ATTACCGTCC AGGGGTGAAC CTCAGCCTCT CCTGCCATGC AGCCTCTAAC 1440 

CCACCTGCAC AGTATTCTTG GCTGATTGAT GGGAACATCC AGCAACACAC ACAAGAGCTC 1500 

TTTATCTCCA ACATCACTGA GAAGAACAGC GGACTCTATA CCTGCCAGGC CAATAACTCA 1560 

GCCAGTGGCC ACAGCAGGAC TACAGTCAAG ACAATCACAG TCTCTGCGGA GCTGCCCAAG 1620

CCCTCCATCT CCAGCAACAA CTCCAAACCC GTGGAGGACA AGGATGCTGT GGCCTTCACC 1680 

TGTGAACCTG AGGCTCAGAA CACAACCTAC CTGTGGTGGG TAAATGGTCA GAGCCTCCCA 1740 

GTCAGTCCCA GGCTGCAGCT GTCCAATGGC AACAGGACCC TCACTCTATT CAATGTCACA 1800

AGAAATGACG CAAGAGCCTA TGTATGTGGA ATCCAGAACT CAGTGAGTGC AAACCGCAGT 1860

GACCCAGTCA CCCTGGATGT CCTCTATGGG CCGGACACCC CCATCATTTC CCCCCCAGAC 1920

TCGTCTTACC TTTCGGGAGC GAACCTCAAC CTCTCCTGCC ACTCGGCCTC TAACCCATCC 1980

CCGCAGTATT CTTGGCGTAT CAATGGGATA CCGCAGCAAC ACACACAAGT TCTCTTTATC 2040

GCCAAAATCA CGCCAAATAA TAACGGGACC TATGCCTGTT TTGTCTCTAA CTTGGCTACT 2100

GGCCGCAATA ATTCCATAGT CAAGAGCATC ACAGTCTCTG CATCTGGAAC TTCTCCTGGT 2160

CTCTCAGCTG GGGCCACTGT CGGCATCATG ATTGGAGTGC TGGTTGGGGT TGCTCTGATA 2220

TAGCAGCCCT GGTGTAGTTT CTTCATTTCA GGAAGACTGA CAGTTGTTTT GCTTCTTCCT 2280

TAAAGCATTT GCAACAGCTA CAGTCTAAAA TTGCTTCTTT ACCAAGGATA TTTACAGAAA 2340 

AGACTCTGAC CAGAGATCGA GACCATCCTA GCCAACATCG TGAAACCCCA TCTCTACTAA 2400

AAATACAAAA ATGAGCTGGG CTTGGTGGCG CGCACCTGTA GTCCCAGTTA CTCGGGAGGC 2460 

TGAGGCAGGA GAATCGCTTG AACCCGGGAG GTGGAGATTG CAGTGAGCCC AGATCGCACC 2520

ACTGCACTCC AGTCTGGCAA CAGAGCAAGA CTCCATCTCA AAAAGAAAAG AAAAGAAGAC 2580

TCTGACCTGT ACTCTTGAAT ACAAGTTTCT GATACCACTG CACTGTCTGA GAATTTCCAA 2640

AACTTTAATG AACTAACTGA CAGCTTCATG AAACTGTCCA CCAAGATCAA GCAGAGAAAA 2700

TAATTAATTT CATGGGACTA AATGAACTAA TGAGGATTGC TGATTCTTTA AATGTCTTGT 2760

TTCCCAGATT TCAGGAAACT TTTTTTCTTT TAAGCTATCC ACTCTTACAG CAATTTGATA 2820

AAATATACTT TTGTGAACAA AAATTGAGAC ATTTACATTT TCTCCCTATG TGGTCGCTCC 2880 

AGACTTGGGA AACTATTCAT GAATATTTAT ATTGTATGGT AATATAGTTA TTGCACAAGT 2940 
TCAATAAAAA TCTGCTCTTT GTATAACAGA AAAA 

Seq ID NO: 40 Protein sequence: 
Protein Accession #: NP_004354.1

1 11 21 31 41 51

| | | | | |

MESPSAPPHR WCIPWQRLLL TASLLTFWNP PTTAKLTIES TPFNVAEGKE VLLLVHNLPQ 60

HLFGYSWYKG ERVDGNRQII GYVIGTQQAT PGPAYSGREI IYPNASLLIQ NIIQNDTGFY 120

TLHVIKSDLV NEEATGQFRV YPELPKPSIS SNNSKPVEDK DAVAFTCEPE TQDATYLWWV 180 

NNQSLPVSPR LQLSNGNRTL TLFNVTRNDT ASYKCETQNP VSARRSDSVI LNVLYGPDAP 240 

TISPLNTSYR SGENLNLSCH AASNPPAQYS WFVNGTFQQS TQELFIPNIT VNNSGSYTCQ 300 

AHNSDTGLNR TTVTTITVYA EPPKPFITSN NSNPVEDEDA VALTCEPEIQ NTTYLWWVNN 360 

QSLPVSPRLQ LSNDNRTLTL LSVTRNDVGP YECGIQNELS VDHSDPVILN VLYGPDDPTI 420

SPSYTYYRPG VNLSLSCHAA SNPPAQYSWL IDGNIQQHTQ ELFISNITEK NSGLYTCQAN 480 

NSASGHSRTT VKTITVSAEL PKPSISSNNS KPVEDKDAVA FTCEPEAQNT TYLWWVNGQS 540 

LPVSPRLQLS NGNRTLTLFN VTRNDARAYV CGIQNSVSAN RSDPVTLDVL YGPDTPIISP 600

PDSSYLSGAN LNLSCHSASN PSPQYSWRIN GIPQQHTQVL FIAKITPNNN GTYACFVSNL 660
ATGRNNSIVK SITVSASGTS PGLSAGATVG IMIGVLVGVA LI






Seq ID NO: 41 DNA sequence 
Nucleic Acid Accession #: NM_006952.1 
Coding sequence: 11-793

1 11 21 31 41 51

| | | | | |

AATCCCGACA ATGGCGAAAG ACAACTCAAC TGTTCGTTGC TTCCAGGGCC TGCTGATTTT 60 

TGGAAATGTG ATTATTGGTT GTTGCGGCAT TGCCCTGACT GCGGAGTGCA TCTTCTTTGT 120 

ATCTGACCAA CACAGCCTCT ACCCACTGCT TGAAGCCACC GACAACGATG ACATCTATGG 180

GGCTGCCTGG ATCGGCATAT TTGTGGGCAT CTGCCTCTTC TGCCTGTCTG TTCTAGGCAT 240 

TGTAGGCATC ATGAAGTCCA GCAGGAAAAT TCTTCTGGCG TATTTCATTC TGATGTTTAT 300 

AGTATATGCC TTTGAAGTGG CATCTTGTAT CACAGCAGCA ACACAACGAG ACTTTTTCAC 360 

ACCCAACCTC TTCCTGAAGC AGATGCTAGA GAGGTACCAA AACAACAGCC CTCCAAACAA 420

TGATGACCAG TGGAAAAACA ATGGAGTCAC CAAAACCTGG GACAGGCTCA TGCTCCAGGA 480

CAATTGCTGT GGCGTAAATG GTCCATCAGA CTGGCAAAAA TACACATCTG CCTTCCGGAC 540

TGAGAATAAT GATGCTGACT ATCCCTGGCC TCGTCAATGC TGTGTTATGA ACAATCTTAA 600 

AGAACCTCTC AACCTGGAGG CTTGTAAACT AGGCGTGCCT GGTTTTTATC ACAATCAGGG 660

CTGCTATGAA CTGATCTCTG GTCCAATGAA CCGACACGCC TGGGGGGTTG CCTGGTTTGG 720 

ATTTGCCATT CTCTGCTGGA CTTTTTGGGT TCTCCTGGGT ACCATGTTCT ACTGGAGCAG 780
AATTGAATAT TAAGAA






Seq ID NO: 42 Protein sequence: 
Protein Accession #: NP_008883.1 



1 11 21 31 41 51 

| | | | | |

MAKDNSTVRC FQGLLIFGNV IIGCCGIALT AECIFFVSDQ HSLYPLLEAT DNDDIYGAAW 60

IGIFVGICLF CLSVLGIVGI MKSSRKILLA YFILMFIVYA FEVASCITAA TQRDFFTPNL 120

FLKQMLERYQ NNSPPNNDDQ WKNNGVTKTW DRLMLQDNCC GVNGPSDWQK YTSAFRTENN 180

DADYPWPRQC CVMNNLKEPL NLEACKLGVP GFYHNQGCYE LISGPMNRHA WGVAWFGFAI 240
LCWTFWVLLG TMFYWSRIEY 

Seq ID NO: 43 DNA sequence

Nucleic Acid Accession #: Eos sequence 
Coding sequence: 83-2605 

1 11 21 31 41 51 

| | | | | |

GCCGGACAGA TCTGCGCGTA TCCTGGAGCC GGCCCAGTTG TGAACTAGGA GAGCTTTGGG 60 

ACCTCTGTCC CAAGCAAGAG AGATGAATGG AGAGTATAGA GGCAGAGGAT TTGGACGAGG 120 

AAGATTTCAA AGCTGGAAAA GGGGAAGAGG TGGTGGGAAC TTCTCAGGAA AATGGAGAGA 180

AAGAGAACAC AGACCTGATC TGAGTAAAAC CACAGGAAAA CGTACTTCTG AACAAACCCC 240

ACAGTTTTTG CTTTCAACAA AGACCCCACA GTCAATGCAG TCAACATTGG ATCGATTCAT 300

ACCATATAAA GGCTGGAAGC TTTATTTCTC TGAAGTTTAC AGCGATAGCT CTCCTTTGAT 360 

TGAGAAGATT CAAGCATTTG AAAAATTTTT CACAAGGCAT ATTGATTTGT ATGACAAGGA 420 

TGAAATAGAA AGAAAGGGAA GTATTTTGGT AGATTTTAAA GAACTGACAG AAGGTGGTGA 480 

AGTAACTAAC TTGATACCAG ATATAGCAAC TGAACTAAGA GATGCACCTG AGAAAACCTT 540

GGCTTGCATG GGTTTGGCAA TACATCAGGT GTTAACTAAG GACCTTGAAA GGCATGCAGC 600

TGAGTTACAA GCCCAGGAAG GATTGTCTAA TGATGGAGAA ACAATGGTAA ATGTGCCACA 660 

TATTCATGCA AGGGTGTACA ACTATGAGCC TTTGACACAG CTCAAGAATG TCAGAGCAAA 720 

TTACTATGGA AAATACATTG CTCTAAGAGG GACAGTGGTT CGTGTCAGTA ATATAAAGCC 780

TCTTTGCACC AAGATGGCTT TTCTTTGTGC TGCATGTGGA GAAATTCAGA GCTTTCCTCT 840

TCCAGATGGA AAATACAGTC TTCCCACAAA GTGTCCTGTG CCTGTGTGTC GAGGCAGGTC 900

ATTTACTGCT CTCCGCAGCT CTCCTCTCAC AGTTACGATG GACTGGCAGT CAATCAAAAT 960 

CCAGGAATTG ATGTCTGATG ATCAGAGAGA AGCAGGTCGG ATTCCACGAA CAATAGAATG 1020 

TGAGCTTGTT CATGATCTTG TGGATAGCTG TGTCCCGGGA GACACAGTGA CTATTACTGG 1080

AATTGTCAAA GTCTCAAATG CGGAAGAAGG TTCTCGAAAT AAGAATGACA AGTGTATGTT 1140

CCTTTTGTAT ATTGAAGCAA ATTCTATTAG TAATAGCAAA GGACAGAAAA CAAAGAGTTC 1200

TGAGGATGGG TGTAAGCATG GAATGTTGAT GGAGTTCTCA CTTAAAGACC TTTATGCCAT 1260

CCAAGAGATT CAAGCTGAAG AAAACCTGTT TAAACTCATT GTCAACTCGC TTTGCCCTGT 1320

CATTTTTGGT CATGAACTTG TTAAAGCAGG TTTGGCATTA GCACTCTTTG GAGGAAGCCA 1380

GAAATACGCA GATGACAAAA ACAGAATTCC AATTCGGGGA GACCCCCACA TCCTTGTTGT 1440 

TGGAGATCCA GGCCTAGGAA AAAGTCAAAT GCTACAGGCA GCGTGCAATG TTGCCCCACG 1500

TGGCGTGTAT GTTTGTGGTA ACACCACGAC CACCTCTGGT CTGACGGTAA CTCTTTCAAA 1560 

AGATAGTTCC TCTGGAGATT TTGCTTTGGA AGCTGGTGCC CTGGTACTTG GTGATCAAGG 1620 

TATTTGTGGA ATCGATGAAT TTGATAAGAT GGGGAATCAA CATCAAGCCT TGTTGGAAGC 1680 

CATGGAGCAG CAAAGTATTA GTCTTGCTAA GGCTGGTGTG GTTTGTAGCC TTCCTGCAAG 1740 

AACTTCCATT ATTGCTGCTG CAAATCCAGT TGGAGGACAT TACAATAAAG CCAAAACAGT 1800

TTCTGAGAAT TTAAAAATGG GGAGTGCACT ACTATCCAGA TTTGATTTGG TCTTTATCCT 1860 

GTTAGATACT CCAAATGAGC ATCATGATCA CTTACTCTCT GAACATGTGA TTGCAATAAG 1920 

AGCTGGAAAG CAGAGAACCA TTAGCAGTGC CACAGTAGCT CGTATGAATA GTCAAGATTC 1980 

AAATACTTCC GTACTTGAAG TAGTTTCTGA GAAGCCATTA TCAGAAAGAC TAAAGGTGGT 2040

TCCTGGAGAA ACAATAGATC CCATTCCCCA CCAGCTATTG AGAAAGTACA TTGGCTATGC 2100

TCGGCAGTAT GTGTACCCAA GGCTATCCAC AGAAGCTGCT CGAGTTCTTC AAGATTTTTA 2160 

CCTTGAGCTC CGGAAACAGA GCCAGAGGTT AAATAGCTCA CCAATCACTA CCAGGCAGCT 2220 

GGAATCTTTG ATTCGTCTGA CAGAGGCACG AGCAAGGTTG GAATTGAGAG AGGAAGCAAC 2280 

CAAAGAAGAC GCTGAGGATA TAGTGGAAAT TATGAAATAT AGCATGCTAG GAACTTACTC 2340 

TGATGAATTT GGGAACCTAG ATTTTGAGCG ATCCCAGCAT GGTTCTGGAA TGAGCAACAG 2400

GTCAACAGCG AAAAGATTTA TTTCTGCTCT CAACAACGTT GCTGAAAGAA CTTATAATAA 2460

TATATTTCAA TTTCATCAAC TTCGGCAGAT TGCCAAAGAA CTAAACATTC AGGTTGCTGA 2520

TTTTGAAAAT TTTATTGGAT CACTAAATGA CCAGGGTTAC CTCTTGAAAA AAGGCCCAAA 2580 

AGTTTACCAG CTTCAAACTA TGTAAAAGGA CTTCACCAAG TTAGGGCCTC CTGGGTTTAT 2640 

TGCAGATTAA AGCCATCTCA GTGAAGATAT GCGTGCACGC ACAGACAGAC AGACACACAC 2700

ACACACACAC ACACACACAC ACACACACAC ACACACAGTC AAATACTGTT CTCTGAAAAA 2760

TGATGTCCCA AAAGTATTAT AATAGGAAAA AAGCATTAAA TATAATAAAC TAATTTAAGA 2820




AGTGATAAAG TCTCCAGATG CAGTAGCTCA CACTGTAATC ACAGTGACTC AGGAGGCTGA 2880

GGTGAGAGGA TTCCTTGAGG CCAGGGTTCG AGACCAACCT TGGGCAACAT AGCAAGACCC 2940 

CATTTCTTAA AAAAAAAAAA AAAAAATTTA AACTTAGCTG GGTATGGTGG CACATGCCTA 3000 

TAGTCTCAGC TACTTGTGAG GCTGAGGCAG GAGGATTCTT TGAGCCCAGG AGTTTGAGGT 3060

TACAGTGAGC CACAATCACA CCAATCACTG CACTCCAGCC TGGGCAATAA AGTAACTCTT 3120

GACTCAAAAA AATAAAAAAA ATTGTAGTGG TAGCCATGTG TTAATTGTTA AATAAATTCT 3180 

CCAAAGGGCT AAAAGTAAAT TACTTATAAA TTTTTTATAG TTGTATTTTT GACCTGCCTT 3240 

TTATATGTAT GAATATTTCA TAGTTTTGCA TATCAGATGT AGGCATACAG ACAAATACAT 3300 

AAACCAATGA ATATATTACA TATTCTGTGT TCCAATAAAA CTTTATTTAT GGACACTAAA 3360 

ATTTGAATTT CATAAAATTT TCCCATGTCA AGAATACAAA ATACTTGAGT TTTGTTTTTA 3420 

GCTATTTAAT AATAGGTCTC ATTTATTCCA CAGGCTGTAG TTTGTAGTCT TGCTTGAAAC 3480 

AATAGAAACA GACTGATTAA GCAGGAGAAG TTTTTTGAAA GAATTTTGTT TGGCTCACGG 3540

AATTATTAGA AGGCAGGTGA ACCAGGAGGG TAAGCTTCCA GCAGCAATTT GTAAAACCAT 3600

GCCTTAGAAT TGGACTAAGG AAGAAGCTGG TGACACTCCA CTGCCACACA GGGCACTGGA 3660 

AGAAAGTGCT GCTGCCTCCC TGCCCCACCT TTGCCACTTC TGCAGCAGGA ATAGGTAGAA 3720

GAATGCCCCC ACCCGCACCG GAACAGCAAC AAAAGGATTC TGCATGAGAT GCCTCCCTAA 3780 

ATTGCTGAAT TCAAAAAAGA AGTTGCATAC AAAGACATCT GATTGAAAAA GGGTATGTTA 3840 

TATGCCCCTT TCATAGGCTG CTAGGGAGTT TTCCTGGTTC TACTTTCAGG TGGTGGGATC 3900

AATAAGACCA GAATTTCTCA TATGTTGTGA GAGGATTCAA ATGTTACAGG GTTGCCAGCC 3960

AAACTATCAA TCATGTATAA ATCCAACAAA CACTTTGTAA CATACAAGAA CTCAGGAAAT 4020

GTGAACCATT GTTGGAGAAT CTACTAAAAT ACGGCTTCCC GCAAACGAAG ATGAATGGAA 4080 

AATGTAAATA AAAAGAACTG GCAGTGTATA TCAGATGTTT AACTATAGGA CCAGAACTAA 4140 

GATGTGGAGA CTATTGCCAT AGACCACAAT GTAAATTTTT AAGTGAGGAA GGAAAAATCA 4200 

GGAATCAAAA GGGGCCAGGT GCAGTGGCTC ACATCTATAA TCCCAGAGCT TTGGGAGTTC 4260

GAGGCAGGAG GATCACTTGA AGCCAGTTTT GAGACCAGCC TATGCAACAC ATTGAGACCC 4320 

TATCTCTACA AAAAATAGAT TAGCTGGGCA CGGTGGTGCA TGCCTATTGT CCTACCTACT 4380

GTGGAGGCTG AAGTAGGAAA TCACTTGAGC CCGAGAGTTT GAGGTTACAG TGAGCTATGA 4440 
TTATACCACT GCACTCCAGC CTGGGCAAGA GAGCAAGACC TTGTCTCTT 

Seq ID NO: 44 Protein sequence: 
Protein Accession #: CAB55276.2

1 11 21 31 41 51

| | | | | |

MNGEYRGRGF GRGRFQSWKR GRGGGNFSGK WREREHRPDL SKTTGKRTSE QTPQFLLSTK 60
TPQSMQSTLD RFIPYKGWKL YFSEVYSDSS PLIEKIQAFE KFFTRHIDLY DKDEIERKGS 120
ILVDFKELTE GGEVTNLIPD IATELRDAPE KTLACMGLAI HQVLTKDLER HAAELQAQEG 180
LSNDGETMVN VPHIHARVYN YEPLTQLKNV RANYYGKYIA LRGTVVRVSN IKPLCTKMAF 240
LCAACGEIQS FPLPDGKYSL PTKCPVPVCR GRSFTALRSS PLTVTMDWQS IKIQELMSDD 300
QREAGRIPRT IECELVHDLV DSCVPGDTVT ITGIVKVSNA EEGSRNKNDK CMFLLYIEAN 360
SISNSKGQKT KSSEDGCKHG MLMEFSLKDL YAIQEIQAEE NLFKLIVNSL CPVIFGHELV 420
KAGLALALFG GSQKYADDKN RIPIRGDPHI LVVGDPGLGK SQMLQAACNV APRGVYVCGN 480
TTTTSGLTVT LSKDSSSGDF ALEAGALVLG DQGICGIDEF DKMGNQHQAL LEAMEQQSIS 540
LAKAGVVCSL PARTSIIAAA NPVGGHYNKA KTVSENLKMG SALLSRFDLV FILLDTPNEH 600
HDHLLSEHVI AIRAGKQRTI SSATVARMNS QDSNTSVLEV VSEKPLSERL KVVPGETIDP 660
IPHQLLRKYI GYARQYVYPR LSTEAARVLQ DFYLELRKQS QRLNSSPITT RQLESLIRLT 720
EARARLELRE EATKEDAEDI VEIMKYSMLG TYSDEFGNLD FERSQHGSGM SNRSTAKRFI 780
SALNNVAERT YNNIFQFHQL RQIAKELNIQ VADFENFIGS LNDQGYLLKK GPKVYQLQTM

Seq ID NO: 45 DNA sequence

Nucleic Acid Accession #: NM_005416.1

Coding sequence: 149..658

1 11 21 31 41 51

| | | | | |

ACCAGATCCC AGAGGCTGAA CACCTCGACC TTCTCTGCAC AGCAGATGAT CCCTGAGCAG 60
CTGAAGACCA GAAAAGCCAC TAAGACTTTC TGCTTAATTC AGGAGCTTAG AGGATTCTTC 120
AAAGAGTGTG TCCACGATCC TTTGAAGCAT GAGTTCTTAC CAGCAGAAGC AGACCTTTAC 180
CCCACCACCT CAGCTTCAAC AGCAGCAGGT GAAACAACCC AGCCAGCCTC CACCTCAGGA 240
AATATTTGTT CCCACAACCA AGGAGCCATG CCACTCAAAG GTTCCACAAC CTGGAAACAC 300
AAAGATTCCA GAGCCAGGCT GTACCAAGGT CCCTGAGCCA GGCTGTACCA AGGTCCCTGA 360
GCCAGGCTGT ACCAAGGTCC CTGAGCCAGG TTGTACCAAG GTCCCTGAGC CAGGCTGTAC 420
CAAGGTCCCT GAGCCAGGTT GTACCAAGGT CCCTGAGCCA GGCTACACCA AGGTCCCTGA 480
ACCAGGCAGC ATCAAGGTCC CTGACCAAGG CTTCATCAAG TTTCCTGAGC CAGGTGCCAT 540
CAAAGTTCCT GAGCAAGGAT ACACCAAAGT TCCTGTGCCA GGCTACACAA AGCTACCAGA 600
GCCATGTCCT TCAACGGTCA CTCCAGGCCC AGCTCAGCAG AAGACCAAGC AGAAGTAATT 660
TGGTGCACAG ACAAGCCCTT GAGAAGCCAA CCACCAGATG CTGGACACCC TCTTCCCATC 720
TGTTTCTGTG TCTTAATTGT CTGTAGACCT TGTAATCAGC ACATTGTCAC CCCAAGCCAT 780
AGTCTCTCTC TTATTTGTAT CCTAAAAATA CGTACTATAA AGCTTTTGTT CACACACACT 840
CTGAAGAATC CTGTAAGCCC CTGAATTAAG CAGAAAGTCT TCATGGCTTT TCTGGTCTTC 900
GGCTGCTCAG GGTTCATCTG AAGATTCGAA TGAAAAGAAA TGCATGTTTC CTGCTCTTCC 960
CTCATTAAAT TGCTTTTAAT TCCA



Seq ID NO: 46 Protein sequence: 
Protein Accession #: NP_005407.1 



1 11 21 31 41 51 

| | | | | |

MSSYQQKQTF TPPPQLQQQQ VKQPSQPPPQ EIFVPTTKEP CHSKVPQPGN TKIPEPGCTK 60
VPEPGCTKVP EPGCTKVPEP GCTKVPEPGC TKVPEPGCTK VPEPGYTKVP EPGSIKVPDQ 120
GFIKFPEPGA IKVPEQGYTK VPVPGYTKLP EPCPSTVTPG PAQQKTKQK

Seq ID NO: 47 DNA sequence

Nucleic Acid Accession #: Eos sequence

1 11 21 31 41 51

| | | | | |

GCGTCGTGTG CAGGCGTCCC CGGGCTGTGG ATAATTAGAC ACGTTCTTCC CTCATTGCCC 60 

AAGGCTCGTT AGAATTCGCC CTAGAGCTGT ATCATGTATT TTCTTTCAAA TTAACTTTGC 120 

TTGCAATTAA GCTTAGGGAA CCAGCAACAA AAGCAAACTT GGCCCGAGGT CGTTCACCGC 180 

GAAAATGGAT TAGAGAAACT TCTTCCCCGA TTTAAGGGGA AAGATTCCTG CGGCCAGCGC 240 

TTTGGGGAAA GTGCCCCGAC CGCAGAGGCG ACGACAGGGG AGCAGGAAGC TGCTCACGGT 300 

AGTCGGCGTT GGCGGCAGCG GTGGCCTTCC TCATCTGGGC GATGTGGGCT CCTAGAAGAG 360

TAAGGATAAC ATCCTGGAAA TGACTTCTGT ACGGTTTGAG CCCAACTGCA CACTCATGAC 420

TTGGAGCTGC CCTGTGGAGT TACAGTTTAC CAAACACATT CATGAACATA ATCTCATTTA 480 
CTAAAAACTT TGTGAGAATT TTCTTTTACT AAAATTTTTT CTTATTACAA A 






Seq ID NO: 48 DNA sequence

Nucleic Acid Accession #: CAT cluster

1 11 21 31 41 51

| | | | | |

TTCCAAATTT TTTTTTTTGT AATAAGAAAA AATTTTAGTA AAAGAAAATT CTCACAAAGT 60
TTTTAGTAAA TGAGATTATG TTCATGAATG TGTTTGGTAA ACTGTAACTC CACAGGGCAG 120
CTCCAAGTCA TGAGTGTGCA GTTGGGCTCA AACCGTACAG AAGTCATTTC CAGGATGTTA 180
TCCTTACTCT TCTCGGAGCC CACATCGCCC AGATGAGGAA GGCCACCGCT GCCGCCAACG 240
CCGACTACCG TGAGCAGCTT CCTGCTCCCC TGTCGTCGCC TCTGCGGTCG GGGCACTTTC 300
CCCAAAGCGC TGGCCGCAGG AATCTTTCCC CTTAAATCGG GGAAGAAGTT TCTCTAATCC 360
ATTTTCGCGG TGAACGACCT CGGGCCAAGT TTGCTTTTGT TGCTGGTTCC CTAAGCTTAA 420
TTGCAAGCAA AGTTAATTTG AAAGAAAATA CATGATACAG CTCTAGGGCG AATTCTAACG 480
AGCCTTGGGC AATGAGGGAA GAACGTGTCT AGTTATCCAC AGCCCGGGGA CGCCTGCACA 540
CGACGCT



Seq ID NO: 49 DNA sequence
Nucleic Acid Accession #: CAT cluster

1 11 21 31 41 51
| | | | | |
TCTTTCTTCT GCTGCTCGTT TGTCTCTCCT GTGCTCTTCT TCTTTCTTTC CCTCGCCGCT 60
CCTGCCGACC TCTGTTGTCT CTTCTCTGAT GGCGGGGGGC GGGAGAAGCT GACCGGTGAG 120
ACCGTAGACC CGAAACCATT GGGTGTCACA AGCCGGTCGC CGGCTTTTTT GGGAGAACCC 180
GACACATGCA GACCAGTTTT CCTGGAACNG CATGACCATG TTATTACTAT GGGCCGCCTC 240
CCCAACCAAA GTGTTTAAAA CTTTTTAGGG CACCCCCAAA ATTTTTTTTT            300
TTCATTTAAA AAACTCTAAT ATTTATATTA AATACAAAGA TACCCAAACC CTTTATGCTT 360
CTTTCTCTGA TCTGTGTCTT TTTTCTTTGA CAGCATCTCG ATTTTTTTTC TGCTGCTTCA 420
TCGCTGTAGC CATGGGAATC CGTTTCATTA TTATGGTAGC AATATGGAGT GCTGTATTCC 480
TAAAGAAACT GACACAGGAG AATCACTTGA ACTTGGGAGG CAGAGTTTGC AGTGAGCCGA 540
GATTGAACCA GTGCACTCCA GCCTTGGCAG CGGAGCAAGA TTCTGTCACA GTTCCTGAAG 600
TGCTGGTATC GTCCTGCAGC CCCATCCTCG GTTCCATTGC GCTGCCAGGC AGGGTGCTGG 660
GACGTGGGGA GAGCTGGTCT ATATATCCGG GTGAAGCTCA GCTGTGGCAC ACCTTGGATG 720
CCGGGTCTCT CCTGGCCCCG GGGACCTAGT ATTTTTGCCA CGAGTGTACA CCAAACAAAG 780
GAGACAGCAT CATTTATGAG CCTGCAGCAT CCACCCTACT GCTGTATCCA GTTTCCATTG 840
ACTG

Seq ID NO: 50 DNA sequence
Nucleic Acid Accession #: L05187
Coding sequence: 1991..2260

1 11 21 31 41 51
| | | | | |
CTGCAGGGAG GCAGGTAGAA AAGGCTTTTG GGTTTTCAGG TGGGGGGCAG TCTAGCCTGA 60
TCAGAAAGGA GGAAAAGGCC AGGGCAGATG TCTGGGTGGA GTGAAGGGAA AAAGTGATCC 120
CAGAAGAAGG ATTAGCCCCT GAAAGTCCCT GAAGTAGGAG AAGGGTAAAG GTGTGGTTGG 180
TGAAGGAAAG CAGGTTTTCC CAGATTAGCA ACCAGTCAGG GGGAGGAAGG TGAGAGTGGG 240
AGAGTCATAA GTAAATTATT CTGAATGTGT GTAGTTTAAT GGAATTGGGA AAAAGATGGG 300
GGAAATGGAT GGAAGGTCTT GGACTCTGAG ACAAGGGGTC TATAATCAGT CCATTTCATT 360
ATTTCTAGCT TCCACCTTCA CCAAGGCAGA CAAGGAGGGC CCACCTCAGC TCCTCTGCTC 420
CCCCTCCCTT TCCCACCTAT TCATGTGTGC AAGAGTGCCC TGTCCCACAG AACACGGGGA 480
ACAACCATCT CAATGACAAG GACAGCAGGT GGCAAGGCTC AACAGGACTC AGATGTCCCC 540
CCAGGGTTAA CTCATGAAAC CCTCCATGAA GCCTGCTGCT CACCCCTCCC TCAAGGCAAG 600
CCCTGCACCT GGGTCTGAGG ATGAGGGTGG CAGTGAAAAT TAGGCCAGTG ACATCATTTT 660
CAGCCAGCTA GTGCCAAAAA ATATCAGGTG GTGTTCATCA AATAAGCCGA GCCAACCGGT 720
GATGAGGATG GTAGTGTGAG TCATGTGTGA CAGGTGAGGA ATGAAAACAG AGTGCCCGAG 780
AGCTTCTATT TCCTTGAGGC AGGGCTCATT CATCTTATAA AAGCCAGCTG GCCATTGCCT 840
TCACACCAAA CCCAAGGGAC CACACAGCCC ATTCTGCTCC GTATACCAGG TAAGTCTCTG 900
ATTGCAACAA ACTGGCAATT CTAGTGTACT TTTTCATTAT TAGAAATTAG CTAAAGGCAA 960
ATATGTGTAA GCAGGTTAAT CCAGGGTTTC AATGGGAGAT AGAGAATAGT GGAATATCTT 1020
TATTTTAAGT TAAATTACAG TCTGGATTTG AAAGGACCTT AGAGATGGTT AGGGCTCCCA 1080
CCTCAGTAGA TAGTCATTGA ACTGGGAGTC CTGGAGAAGA TTGTTCAAAT GCCCATGGGA 1140
AGTTCATAGC AGAACTAGAA CTCAGGCCAG AGCACTCTCA GTAACACTGC AATTTCCCCC 1200
TGACAAGATA TTTATAGAAA TTTTAATTTA TTAGATGGAT CTCTACTGAG CATTTATTCC 1260
ATTTAAGGCA GTATGCTAGG CACTTTGGAC AAATCAATGC CCTAACGTAC TTACTTAACA 1320
AACATAAAAC CTAGCAGGAA GGTAATACAT ATATATAAAT AAATGAAATG CAAAGTAGAT 1380
AGTAATTGGC ATGACGGAGA TGGGCAGAGA AGGGCTGTGC ACTTTTGGGA GACTTGCTCA 1440
AGGAGACCTC TAGGGTGTCA AGTGATGTGA GCTATGATGG AGGGGTATTT GGACAAGCAG 1500
AGATGGGAAG AAAAGCATTT GGAAGGGACT GTGTAAGCAC AGACCAGAAG CAAAACCATA 1560
GAGGCTTAGA TGAATATAAA GCCATCCTAT AAGTCACAGG CTTTCTACAT GGTACTAGGA 1620
GAGGAAAGTG GTCTGATGCC ATTTTCCAAA AGACCTAATA TGCGGACCTC ATGTCCCTCA 1680
GAAGCCAGCT TTAGTAGGGC ATTTTTCCAG AACAGATATA AGGTGCCTTG GGTAGGAAGG 1740
GAGCCAAGAA GAGAACTCCA ATAAAATGGA GCAGAAGAAA TTGCCTTTTA GCTCCTCCTC 1800
TTCAAAGGGC CTGAAAATTA TCCAAGCTTA TTTCATTTTT AAATGTAATG GGGGAGCTAA 1860
GGGAGATGAA AGGCTTTCTC TTCTAAAGGG TCCTGAAATA AAATCTGTTT GGCATTGAAT 1920
TTGTATCCAT CTTTCTTTAA TTGAATCACT GTGTCAGCTT TCTGTCTCTA GAAAAAAACA 1980
CATTTGAAGC ATGAATTCTC AGCAGCAGAA GCAGCCTTGC ACCCCACCCC CTCAGCCTCA 2040
GCAGCAGCAG GTGAAACAAC CTTGCCAGCC TCCACCCCAG GAACCATGCA TCCCCAAAAC 2100
CAAGGAGCCC TGCCAACCCA AGGTGCCTGA GCCCTGCCAC CCCAAAGTGC CTGAGCCCTG 2160
CCAGCCCAAG ATTCCAGAGC CCTGCCAGCC CAAGGTGCCT GAGCCCTGCC CTTCAACGGT 2220
CACTCCAGCA CCAGCCCAGC AGAAGACCAA GCAGAAGTAA TGTGGTCCAC AGCCATGCCC 2280
TTGAGGAGCT GGCCACTGGA TACTGAACAC CCTACTCCAT TCTGCTTATG AATCCCATTT 2340
GCCTATTGAC CCTGCAGTTA GCATGCTGTC ACCCTGAATC ATAATCGCTC CTTTGCACCT 2400
CTAAAAAGAT GTCCCTTACC CTCATTCTGG AGGCTCCTGA GCCTCTGCGT AAGGCTGAAC 2460
GTCTCACTGA CTGAGCTAGT CTTCTTGTTG CTCGGGTGCA TTTGAGGATG GATTTGGGGA 2520
AGGTCAAGTG ACCATCCCTA G

Seq ID NO: 51 Protein sequence:
Protein Accession #: AAC26838

1 11 21 31 41 51
| | | | | |
MNSQQQKQPC TPPPQPQQQQ VKQPCQPPPQ EPCIPKTKEP CQPKVPEPCH PKVPEPCQPK
IPEPCQPKVP EPCPSTVTPA PAQQKTKQK



Seq ID NO: 52 DNA sequence
Nucleic Acid Accession #: NM_002638.1
Coding sequence: 120-473

1 11 21 31 41 51
| | | | | |
CAATACAGCT AAGGAATTAT CCCTTGTAAA TACCACAGAC CCGCCCTGGA GCCAGGCCAA 60
GCTGGACTGC ATAAAGATTG GTATGGCCTT AGCTCTTAGC CAAACACCTT CCTGACACCA 120
TGAGGGCCAG CAGCTTCTTG ATCGTGGTGG TGTTCCTCAT CGCTGGGACG CTGGTTCTAG 180
AGGCAGCTGT CACGGGAGTT CCTGTTAAAG GTCAAGACAC TGTCAAAGGC CGTGTTCCAT 240
TCAATGGACA AGATCCCGTT AAAGGACAAG TTTCAGTTAA AGGTCAAGAT AAAGTCAAAG 300
CGCAAGAGCC AGTCAAAGGT CCAGTCTCCA CTAAGCCTGG CTCCTGCCCC ATTATCTTGA 360
TCCGGTGCGC GATGTTGAAT CCCCCTAACC GCTGCTTGAA AGATACTGAC TGCCCAGGAA 420
TCAAGAAGTG CTGTGAAGGC TCTTGCGGGA TGGCCTGTTT CGTTCCCCAG TGAAGGGAGC 480
CGGTCCTTGC TGCACCTGTG CCGTCCCCAG AGCTACAGGC CCCATCTGGT CCTAAGTCCC 540
TGCTGCCCTT CCCCTTCCCA CACTGTCCAT TCTTCCTCCC ATTCAGGATG CCCACGGCTG 600
GAGCTGCCTC TCTCATCCAC TTTCCAATAA A

Seq ID NO: 53 Protein sequence:
Protein Accession #: NP_002629.1

1 11 21 31 41 51
| | | | | |
MRASSFLIVV VFLIAGTLVL EAAVTGVPVK GQDTVKGRVP FNGQDPVKGQ VSVKGQDKVK 60
AQEPVKGPVS TKPGSCPIIL IRCAMLNPPN RCLKDTDCPG IKKCCEGSCG MACFVPQ

Seq ID NO: 54 DNA sequence
Nucleic Acid Accession #: NM_019618
Coding sequence: 75-584

1 11 21 31 41 51
| | | | | |
GGCACGAGCC ACGATTCAGT CCCCTGGACT GTAGATAAAG ACCCTTTCTT GCCAGGTGCT 60
GAGACAACCA CACTATGAGA GGCACTCCAG GAGACGCTGA TGGTGGAGGA AGGGCCGTCT 120
ATCAATCAAT GTGTAAACCT ATTACTGGGA CTATTAATGA TTTGAATCAG CAAGTGTGGA 180
CCCTTCAGGG TCAGAACCTT GTGGCAGTTC CACGAAGTGA CAGTGTGACC CCAGTCACTG 240
TTGCTGTTAT CACATGCAAG TATCCAGAGG CTCTTGAGCA AGGCAGAGGG GATCCCATTT 300
ATTTGGGAAT CCAGAATCCA GAAATGTGTT TGTATTGTGA GAAGGTTGGA GAACAGCCCA 360
CATTGCAGCT AAAAGAGCAG AAGATCATGG ATCTGTATGG CCAACCCGAG CCCGTGAAAC 420
CCTTCCTTTT CTACCGTGCC AAGACTGGTA GGACCTCCAC CCTTGAGTCT GTGGCCTTCC 480
CGGACTGGTT CATTGCCTCC TCCAAGAGAG ACCAGCCCAT CATTCTGACT TCAGAACTTG 540
GGAAGTCATA CAACACTGCC TTTGAATTAA ATATAAATGA CTGAACTCAG CCTAGAGGTG 600
GCAGCTTGGT CTTTGTCTTA AAGTTTCTGG TTCCCAATGT GTTTTCGTCT ACATTTTCTT 660
AGTGTCATTT TCACGCTGGT GCTGAGACAG GGGCAAGGCT GCTGTTATCA TCTCATTTTA 720
TAATGAAGAA GAAGCAATTA CTTCATAGCA ACTGAAGAAC AGGATGTGGC CTCAGAAGCA 780
GGAGAGCTGG GTGGTATAAG GCTGTCCTCT CAAGCTGGTG CTGTGTAGGC CACAAGGCAT 840
CTGCATGAGT GACTTTAAGA CTCAAAGACC AAACACTGAG CTTTCTTCTA GGGGTGGGTA 900
TGAAGATGCT TCAGAGCTCA TGCGCGTTAC CCACGATGGC ATGACTAGCA CAGAGCTGAT 960
CTCTGTTTCT GTTTTGCTTT ATTCCCTCTT GGGATGATAT CATCCAGTCT TTATATGTTG 1020
CCAATATACC TCATTGTGTG TAATAGAACC TTCTTAGCAT TAAGACCTTG TAAACAAAAA 1080
TAATTCTTGT GTTAAGTTAA ATCATTTTTG TCCTAATTGT AATGTGTAAT CTTAAAGTTA 1140
AATAAACTTT GTGTATTTAT ATAATAAAAA AAAAAAAAAA AAA



Seq ID NO: 55 Protein sequence:
Protein Accession #: NP_062564

1 11 21 31 41 51
| | | | | |
MRGTPGDADG GGRAVYQSMC KPITGTINDL NQQVWTLQGQ NLVAVPRSDS VTPVTVAVIT 60
CKYPEALEQG RGDPIYLGIQ NPEMCLYCEK VGEQPTLQLK EQKIMDLYGQ PEPVKPFLFY 120
RAKTGRTSTL ESVAFPDWFI ASSKRDQPII LTSELGKSYN TAFELNIND

Seq ID NO: 56 DNA sequence 

Nucleic Acid Accession #: NM_003125 

Coding sequence: 65-334 




1 11 21 31 41 51
| | | | | |
AGCAGTTCTA AGGGACCATA CAGAGTATTC CTCTCTTCAC ACCAGGACCA GCCACTGTTG 60
CAGCATGAGT TCCCAGCAGC AGAAGCAGCC CTGCATCCCA CCCCCTCAGC TTCAGCAGCA 120
GCAGGTGAAA CAGCCTTGCC AGCCTCCACC TCAGGAACCA TGCATCCCCA AAACCAAGGA 180
GCCCTGCCAC CCCAAGGTGC CTGAGCCCTG CCACCCCAAA GTGCCTGAGC CCTGCCAGCC 240
CAAGCTTCCA GAGCCATGCC ACCCCAAGGT GCCTGAGCCC TGCCCTTCAA TAGTCACTCC 300
AGCACCAGCC CAGCAGAAGA CCAAGCAGAA GTAATGTGGT CCACAGCCAT GCCCTTGAGG 360
AGCCGGCCAC CAGATGCTGA ATCCCCTATC CCATTCTGTG TATGAGTCCC ATTTGCCTTG 420
CAATTAGCAT TCTGTCTCCC CCAAAAAAGA ATGTGCTATG AAGCTTTCTT TCCTACACAC 480
TCTGAGTCTC TGAATGAAGC TGAAGGTCTT AGTACCAGAG CTAGTTTTCA GCTGCTCAGA 540
ATTCATCTGA AGAGAGACTT AAGATGAAAG CAAATGATTC AGCTCCCTTA TACCCCCATT 600
AAATTCACTT TCAATTCCA



Seq ID NO: 57 Protein sequence:
Protein Accession #: NP_003116

1 11 21 31 41 51
| | | | | |
MSSQQQKQPC IPPPQLQQQQ VKQPCQPPPQ EPCIPKTKEP CHPKVPEPCH PKVPEPCQPK 60
LPEPCHPKVP EPCPSIVTPA PAQQKTKQK

Seq ID NO: 58 DNA sequence 

Nucleic Acid Accession #: NM_001793.2 

Coding sequence: 71-2560 



1 11 21 31 41 51
| | | | | |
AAAGGGGCAA GAGCTGAGCG GAACACCGGC CCGCCGTCGC GGCAGCTGCT TCACCCCTCT 60
CTCTGCAGCC ATGGGGCTCC CTCGTGGACC TCTCGCGTCT CTCCTCCTTC TCCAGGTTTG 120
CTGGCTGCAG TGCGCGGCCT CCGAGCCGTG CCGGGCGGTC TTCAGGGAGG CTGAAGTGAC 180
CTTGGAGGCG GGAGGCGCGG AGCAGGAGCC CGGCCAGGCG CTGGGGAAAG TATTCATGGG 240
CTGCCCTGGG CAAGAGCCAG CTCTGTTTAG CACTGATAAT GATGACTTCA CTGTGCGGAA 300
TGGCGAGACA GTCCAGGAAA GAAGGTCACT GAAGGAAAGG AATCCATTGA AGATCTTCCC 360
ATCCAAACGT ATCTTACGAA GACACAAGAG AGATTGGGTG GTTGCTCCAA TATCTGTCCC 420
TGAAAATGGC AAGGGTCCCT TCCCCCAGAG ACTGAATGAG CTCAAGTCTA ATAAAGATAG 480
AGACACCAAG ATTTTCTACA GCATCACGGG GCCGGGGGCA GACAGCCCCC CTGAGGGTGT 540
CTTCGCTGTA GAGAAGGAGA CAGGCTGGTT GTTGTTGAAT AAGCCACTGG ACCGGGAGGA 600
GATTGCCAAG TATGAGCTCT TTGGCCACGC TGTGTCAGAG AATGGTGCCT CAGTGGAGGA 660
CCCCATGAAC ATCTCCATCA TCGTGACCGA CCAGAATGAC CACAAGCCCA AGTTTACCCA 720
GGACACCTTC CGAGGGAGTG TCTTAGAGGG AGTCCTACCA GGTACTTCTG TGATGCAGGT 780
GACAGCCACG GATGAGGATG ATGCCATCTA CACCTACAAT GGGGTGGTTG CTTACTCCAT 840
CCATAGCCAA GAACCAAAGG ACCCACACGA CCTCATGTTC ACCATTCACC GGAGCACAGG 900
CACCATCAGC GTCATCTCCA GTGGCCTGGA CCGGGAAAAA GTCCCTGAGT ACACACTGAC 960
CATCCAGGCC ACAGACATGG ATGGGGACGG CTCCACCACC ACGGCAGTGG CAGTAGTGGA 1020
GATCCTTGAT GCCAATGACA ATGCTCCCAT GTTTGACCCC CAGAAGTACG AGGCCCATGT 1080
GCCTGAGAAT GCAGTGGGCC ATGAGGTGGA GAGGCTGACG GTCACTGATC TGGACGCCCC 1140
CAACTCACCA GCGTGGCGTG CCACCTACCT TATCATGGGC GGTGACGACG GGGACCATTT 1200
TACCATCACC ACCCACCCTG AGAGCAACCA GGGCATCCTG ACAACCAGGA AGGGTTTGGA 1260
TTTTGAGGCC AAAAACCAGC ACACCCTGTA CGTTGAAGTG ACCAACGAGG CCCCTTTTGT 1320
GCTGAAGCTC CCAACCTCCA CAGCCACCAT AGTGGTCCAC GTGGAGGATG TGAATGAGGC 1380
ACCTGTGTTT GTCCCACCCT CCAAAGTCGT TGAGGTCCAG GAGGGCATCC CCACTGGGGA 1440
GCCTGTGTGT GTCTACACTG CAGAAGACCC TGACAAGGAG AATCAAAAGA TCAGCTACCG 1500
CATCCTGAGA GACCCAGCAG GGTGGCTAGC CATGGACCCA GACAGTGGGC AGGTCACAGC 1560
TGTGGGCACC CTCGACCGTG AGGATGAGCA GTTTGTGAGG AACAACATCT ATGAAGTCAT 1620
GGTCTTGGCC ATGGACAATG GAAGCCCTCC CACCACTGGC ACGGGAACCC TTCTGCTAAC 1680
ACTGATTGAT GTCAATGACC ATGGCCCAGT GCCTGAGCCC CGTCAGATCA CCATCTGCAA 1740
CCAAAGCCCT GTGCGCGAGG TGCTGAACAT CACGGACAAG GACCTGTCTC CCCACACCTC 1800
CCCTTTCCAG GCCCAGCTCA CAGATGACTC AGACATCTAC TGGACGGCAG AGGTCAACGA 1860
GGAAGGTGAC ACAGTGGTCT TGTCCCTGAA GAAGTTCCTG AAGCAGGATA CATATGACGT 1920
GCACCTTTCT CTGTCTGACC ATGGCAACAA AGAGCAGCTG ACGGTGATCA GGGCCACTGT 1980
GTGCGACTGC CATGGCCATG TCGAAACCTG CCCTGGACCC TGGAAGGGAG GTTTCATCCT 2040
CCCTGTGCTG GGGGCTGTCC TGGCTCTGCT GTTCCTCCTG CTGGTGCTGC TTTTGTTGGT 2100
GAGAAAGAAG CGGAAGATCA AGGAGCCCCT CCTACTCCCA GAAGATGACA CCCGTGACAA 2160
CGTCTTCTAC TATGGCGAAG AGGGGGGTGG CGAAGAGGAC CAGGACTATG ACATCACCCA 2220
GCTCCACCGA GGTCTGGAGG CCAGGCCGGA GGTGGTTCTC CGCAATGACG TGGCACCAAC 2280
CATCATCCCG ACACCCATGT ACCGTCCTCG GCCAGCCAAC CCAGATGAAA TCGGCAACTT 2340
TATAATTGAG AACCTGAAGG CGGCTAACAC AGACCCCACA GCCCCGCCCT ACGACACCCT 2400
CTTGGTGTTC GACTATGAGG GCAGCGGCTC CGACGCCGCG TCCCTGAGCT CCCTCACCTC 2460
CTCCGCCTCC GACCAAGACC AAGATTACGA TTATCTGAAC GAGTGGGGCA GCCGCTTCAA 2520
GAAGCTGGCA GACATGTACG GTGGCGGGGA GGACGACTAG GCGGCCTGCC TGCAGGGCTG 2580
GGGACCAAAC GTCAGGCCAC AGAGCATCTC CAAGGGGTCT CAGTTCCCCC TTCAGCTGAG 2640
GACTTCGGAG CTTGTCAGGA AGTGGCCGTA GCAACTTGGC GGAGACAGGC TATGAGTCTG 2700
ACGTTAGAGT GGTTGCTTCC TTAGCCTTTC AGGATGGAGG AATGTGGGCA GTTTGACTTC 2760
AGCACTGAAA ACCTCTCCAC CTGGGCCAGG GTTGCCTCAG AGGCCAAGTT TCCAGAAGCC 2820
TCTTACCTGC CGTAAAATGC TCAACCCTGT GTCCTGGGCC TGGGCCTGCT GTGACTGACC 2880
TACAGTGGAC TTTCTCTCTG GAATGGAACC TTCTTAGGCC TCCTGGTGCA ACTTAATTTT 2940
TTTTTTTAAT GCTATCTTCA AAACGTTAGA GAAAGTTCTT CAAAAGTGCA GCCCAGAGCT 3000
GCTGGGCCCA CTGGCCGTCC TGCATTTCTG GTTTCCAGAC CCCAATGCCT CCCATTCGGA 3060
TGGATCTCTG CGTTTTTATA CTGAGTGTGC CTAGGTTGCC CCTTATTTTT TATTTTCCCT 3120
GTTGCGTTGC TATAGATGAA GGGTGAGGAC AATCGTGTAT ATGTACTAGA ACTTTTTTAT 3180
TAAAGAAACT TTTCCCAGAA AAAAA



Seq ID NO: 59 Protein sequence:
Protein Accession #: NP_001784.2

1 11 21 31 41 51
| | | | | |
MGLPRGPLAS LLLLQVCWLQ CAASEPCRAV FREAEVTLEA GGAEQEPGQA LGKVFMGCPG 60
QEPALFSTDN DDFTVRNGET VQERRSLKER NPLKIFPSKR ILRRHKRDWV VAPISVPENG 120
KGPFPQRLNQ LKSNKDRDTK IFYSITGPGA DSPPEGVFAV EKETGWLLLN KPLDREEIAK 180
YELFGHAVSE NGASVEDPMN ISIIVTDQND HKPKFTQDTF RGSVLEGVLP GTSVMQVTAT 240
DEDDAIYTYN GVVAYSIHSQ EPKDPHDLMF TIHRSTGTIS VISSGLDREK VPEYTLTIQA 300
TDMDGDGSTT TAVAVVEILD ANDNAPMFDP QKYEAHVPEN AVGHEVQRLT VTDLDAPNSP 360
AWRATYLIMG GDDGDHFTIT THPESNQGIL TTRKGLDFEA KNQHTLYVEV TNEAPFVLKL 420
PTSTATIVVH VEDVNEAPVF VPPSKVVEVQ EGIPTGEPVC VYTAEDPDKE NQKISYRILR 480
DPAGWLAMDP DSGQVTAVGT LDREDEQFVR NNIYEVMVLA MDNGSPPTTG TGTLLLTLID 540
VNDHGPVPEP RQITICNQSP VRQVLNITDK DLSPHTSPFQ AQLTDDSDIY WTAEVNEEGD 600
TVVLSLKKFL KQDTYDVHLS LSDHGNKEQL TVIRATVCDC HGHVETCPGP WKGGFILPVL 660
GAVLALLFLL LVLLLLVRKK RKIKEPLLLP EDDTRDNVFY YGEEGGGEED QDYDITQLHR 720
GLEARPEVVL RNDVAPTIIP TPMYRPRPAN PDEIGNFIIE NLKAANTDPT APPYDTLLVF 780
DYEGSGSDAA SLSSLTSSAS DQDQDYDYLN EWGSRFKKLA DMYGGGEDD

Seq ID NO: 60 DNA sequence
Nucleic Acid Accession #: Eos sequence
Coding sequence: 162-428

1 11 21 31 41 51
| | | | | |
GCGTTCCGTT GGCGGCGGAT TCGAACGTTC GGACTGAGGT TTTTCTGCCT GAAGAAGCGT 60
CATACGGACC GGATTGTTTT CGCTGGCCCA GTGTCCCCGG AGCTTGTGTG CGATACAGAG 120
AGCACCTCGG AAGCTGAGGC AGCTGGTACT TGACAGAGAG GATGGCGCTG TCGACCATAG 180
TCTCCCAGAG GAAGCAGATA AAGCGGAAGG CTCCCCGTGG CTTTCTAAAG CGAGTCTTCA 240
AGCGAAAGAA GCCTCAACTT CGTCTGGAGA AAAGTGGTGA CTTATTGGTC CATCTGAACT 300
GTTTACTGTT TGTTCATCGA TTAGCAGAAG AGTCCAGGAC AAACGCTTGT GCGAGTAAAT 360
GTAGAGTCAT TAACAAGGAG CATGTACTGG CCGCAGCAAA GGTAATTCTA AAGAAGAGCA 420
GAGGTTAGAA GTCAAAGAAC ATATTCTTGA AAGTTATGAT GCATTCTTTT GGGTGGTAAC 480
AGATCATAAA GACATTTTTT ACACATCAGT TAATATGGGA TTATTAAATA TTGG

Seq ID NO: 61 Protein sequence:
Protein Accession #: Eos sequence

1 11 21 31 41 51
| | | | | |
MALSTIVSQR KQIKRKAPRG FLKRVFKRKK PQLRLEKSGD LLVHLNCLLF VHRLAEESRT 60
NACASKCRVI NKEHVLAAAK VILKKSRG

Seq ID NO: 62 DNA sequence
Nucleic Acid Accession #: NM_000094.2
Coding sequence: 99-8933

1 11 21 31 41 51
| | | | | |
GGGCTGGAGG GGCGCTGGGC TCGGACCTGC CAAGGCCACC GCAGGGGGGA GCAAGGGACA 60
GAGGCGGGGG TCCTAGCTGA CGGCTTTTAC TGCCTAGGAT GACGCTGCGG CTTCTGGTGG 120
CCGCGCTCTG CGCCGGGATC CTGGCAGAGG CGCCCCGAGT GCGAGCCCAG CACAGGGAGA 180
GAGTGACCTG CACGCGCCTT TACGCCGCTG ACATTGTGTT CTTACTGGAT GGCTCCTCAT 240
CCATTGGCCG CAGCAATTTC CGCGAGGTCC GCAGCTTTCT CGAAGGGCTG GTGCTGCCTT 300
TCTCTGGAGC AGCCAGTGCA CAGGGTGTGC GCTTTGCCAC AGTGCAGTAC AGCGATGACC 360
CACGGACAGA GTTCGGCCTG GATGCACTTG GCTCTGGGGG TGATGTGATC CGCGCCATCC 420
GTGAGCTTAG CTACAAGGGG GGCAACACTC GCACAGGGGC TGCAATTCTC CATGTGGCTG 480
ACCATGTCTT CCTGCCCCAG CTGGCCCGAC CTGGTGTCCC CAAGGTCTGC ATCCTGATCA 540
CAGACGGGAA GTCCCAGGAC CTGGTGGACA CAGCTGCCCA AAGGCTGAAG GGGCAGGGGG 600
TCAAGCTATT TGCTGTGGGG ATCAAGAATG CTGACCCTGA GGAGCTGAAG CGAGTTGCCT 660
CACAGCCAAC CTCCGACTTC TTCTTCTTCG TCAATGACTT CAGCATCTTG AGGACACTAC 720
TGCCCCTCGT TTCCCGGAGA GTGTGCACGA CTGCTGGTGG CGTGCCTGTG ACCCGACCTC 780
CGGATGACTC GACCTCTGCT CCACGAGACC TGGTGCTGTC TGAGCCAAGC AGCCAATCCT 840
TGAGAGTACA GTGGACAGCG GCCAGTGGCC CTGTGACTGG CTACAAGGTC CAGTACACTC 900
CTCTGACGGG GCTGGGACAG CCACTGCCGA GTGAGCGGCA GGAGGTGAAC GTCCCAGCTG 960
GTGAGACCAG TGTGCGGCTG CGGGGTCTCC GGCCACTGAC CGAGTACCAA GTGACTGTGA 1020
TTGCCCTCTA CGCCAACAGC ATCGGGGAGG CTGTGAGCGG GACAGCTCGG ACCACTGCCC 1080
TAGAAGGGCC GGAACTGACC ATCCAGAATA CCACAGCCCA CAGCCTCCTG GTGGCCTGGC 1140
GGAGTGTGCC AGGTGCCACT GGCTACCGTG TGACATGGCG GGTCCTCAGT GGTGGGCCCA 1200
CACAGCAGCA GGAGCTGGGC CCTGGGCAGG GTTCAGTGTT GCTGCGTGAC TTGGAGCCTG 1260
GCACGGACTA TGAGGTGACC GTGAGCACCC TATTTGGCCG CAGTGTGGGG CCCGCCACTT 1320
CCCTGATGGC TCGCACTGAC GCTTCTGTTG AGCAGACCCT GCGCCCGGTC ATCCTGGGCC 1380
CCACATCCAT CCTCCTTTCC TGGAACTTGG TGCCTGAGGC CCGTGGCTAC CGGTTGGAAT 1440
GGCGGCGTGA GACTGGCTTG GAGCCACCGC AGAAGGTGGT ACTGCCCTCT GATGTGACCC 1500
GCTACCAGTT GGATGGGCTG CAGCCGGGCA CTGAGTACCG CCTCACACTC TACACTCTGC 1560
TGGAGGGCCA CGAGGTGGCC ACCCCTGCAA CCGTGGTTCC CACTGGACCA GAGCTGCCTG 1620
TGAGCCCTGT AACAGACCTG CAAGCCACCG AGCTGCCCGG GCAGCGGGTG CGAGTGTCCT 1680
GGAGCCCAGT CCCTGGTGCC ACCCAGTACC GCATCATTGT GCGCAGCACC CAGGGGGTTG 1740
AGCGGACCCT GGTGCTTCCT GGGAGTCAGA CAGCATTCGA CTTGGATGAC GTTCAGGCTG 1800
GGCTTAGCTA CACTGTGCGG GTGTCTGCTC GAGTGGGTCC CCGTGAGGGC AGTGCCAGTG 1860
TCCTCACTGT CCGCCGGGAG CCGGAAACTC CACTTGCTGT TCCAGGGCTG CGGGTTGTGG 1920
TGTCAGATGC AACGCGAGTG AGGGTGGCCT GGGGACCCGT CCCTGGAGCC AGTGGATTTC 1980
GGATTAGCTG GAGCACAGGC AGTGGTCCGG AGTCCAGCCA GACACTGCCC CCAGACTCTA 2040
CTGCCACAGA CATCACAGGG CTGCAGCCTG GAACCACCTA CCAGGTGGCT GTGTCGGTAC 2100
TGCGAGGCAG AGAGGAGGGC CCTGCTGCAG TCATCGTGGC TCGAACGGAC CCACTGGGCC 2160
CAGTGAGGAC GGTCCATGTG ACTCAGGCCA GCAGCTCATC TGTCACCATT ACCTGGACCA 2220
GGGTTCCTGG CGCCACAGGA TACAGGGTTT CCTGGCACTC AGCCCACGGC CCAGAGAAAT 2280
CCCAGTTGGT TTCTGGGGAG GCCACGGTGG CTGAGCTGGA TGGACTGGAG CCAGATACTG 2340
AGTATACGGT GCATGTGAGG GCCCATGTGG CTGGCGTGGA TGGGCCCCCT GCCTCTGTGG 2400
TTGTGAGGAC TGCCCCTGAG CCTGTGGGTC GTGTGTCGAG GCTGCAGATC CTCAATGCTT 2460
CCAGCGACGT TCTACGGATC ACCTGGGTAG GGGTCACTGG AGCCACAGCT TACAGACTGG 2520
CCTGGGGCCG GAGTGAAGGC GGCCCCATGA GGCACCAGAT ACTCCCAGGA AACACAGACT 2580
CTGCAGAGAT CCGGGGTCTC GAAGGTGGAG TCAGCTACTC AGTGCGAGTG ACTGCACTTG 2640
TCGGGGACCG CGAGGGCACA CCTGTCTCCA TTGTTGTCAC TACGCCGCCT GAGGCTCCGC 2700
CAGCCCTGGG GACGCTTCAC GTGGTGCAGC GCGGGGAGCA CTCGCTGAGG CTGCGCTGGG 2760
AGCCGGTGCC CAGAGCGCAG GGCTTCCTTC TGCACTGGCA ACCTGAGGGT GGCCAGGAAC 2820
AGTCCCGGGT CCTGGGGCCG GAGCTCAGCA GCTATCACCT GGACGGGCTG GAGCCAGCGA 2880
CACAGTACCG CGTGAGGCTG AGTGTCCTAG GGCCGGCTGG AGAAGGGCCC TCTGCAGAGG 2940
TGACTGCGCG CACTGAGTCA CCTCGTGTTC CAAGCATTGA ACTACGTGTG GTGGACACCT 3000
CGATCGACTC GGTGACTTTG GCCTGGACTC CAGTGTCCAG GGCATCCAGC TACATCCTAT 3060
CCTGGCGGCC ACTCAGAGGC CCTGGCCAGG AAGTGCCTGG GTCCCCGCAG ACACTTCCAG 3120
GGATCTCAAG CTCCCAGCGG GTGACAGGGC TAGAGCCTGG CGTCTCTTAC ATCTTCTCCC 3180
TGACGCCTGT CCTGGATGGT GTGCGGGGTC CTGAGGCATC TGTCACACAG ACGCCAGTGT 3240
GCCCCCGTGG CCTGGCGGAT GTGGTGTTCC TACCACATGC CACTCAAGAC AATGCTCACC 3300
GTGCGGAGGC TACGAGGAGG GTCCTGGAGC GTCTGGTGTT GGCACTTGGG CCTCTTGGGC 3360
CACAGGCAGT TCAGGTTGGC CTGCTGTCTT ACAGTCATCG GCCCTCCCCA CTGTTCCCAC 3420
TGAATGGCTC CCATGACCTT GGCATTATCT TGCAAAGGAT CCGTGACATG CCCTACATGG 3480
ACCCAAGTGG GAACAACCTG GGCACAGCCG TGGTCACAGC TCACAGATAC ATGTTGGCAC 3540
CAGATGCTCC TGGGCGCCGC CAGCACGTAC CAGGGGTGAT GGTTCTGCTA GTGGATGAAC 3600
CCTTGAGAGG TGACATATTC AGCCCCATCC GTGAGGCCCA GGCTTCTGGG CTTAATGTGG 3660
TGATGTTGGG AATGGCTGGA GCGGACCCAG AGCAGCTGCG TCGCTTGGCG CCGGGTATGG 3720
ACTCTGTCCA GACCTTCTTC GCCGTGGATG ATGGGCCAAG CCTGGACCAG GCAGTCAGTG 3780
GTCTGGCCAC AGCCCTGTGT CAGGCATCCT TCACTACTCA GCCCCGGCCA GAGCCCTGCC 3840
CAGTGTATTG TCCAAAGGGC CAGAAGGGGG AACCTGGAGA GATGGGCCTG AGAGGACAAG 3900
TTGGGCCTCC TGGCGACCCT GGCCTCCCGG GCAGGACCGG TGCTCCCGGC CCCCAGGGGC 3960
CCCCTGGAAG TGCCACTGCC AAGGGCGAGA GGGGCTTCCC TGGAGCAGAT GGGCGTCCAG 4020
GCAGCCCTGG CCGCGCCGGG AATCCTGGGA CCCCTGGAGC CCCTGGCCTA AAGGGCTCTC 4080
CAGGGTTGCC TGGCCCTCGT GGGGACCCGG GAGAGCGAGG ACCTCGAGGC CCAAAGGGGG 4140
AGCCGGGGGC TCCCGGACAA GTCATCGGAG GTGAAGGACC TGGGCTTCCT GGGCGGAAAG 4200
GGGACCCTGG ACCATCGGGC CCCCCTGGAC CTCGTGGACC ACTGGGGGAC CCAGGACCCC 4260
GTGGCCCCCC AGGGCTTCCT GGAACAGCCA TGAAGGGTGA CAAAGGCGAT CGTGGGGAGC 4320
GGGGTCCCCC TGGACCAGGT GAAGGTGGCA TTGCTCCTGG GGAGCCTGGG CTGCCGGGTC 4380
TTCCCGGAAG CCCTGGACCC CAAGGCCCCG TTGGCCCCCC TGGAAAGAAA GGAGAAAAAG 4440
GTGACTCTGA GGATGGAGCT CCAGGCCTCC CAGGACAACC TGGGTCTCCG GGTGAGCAGG 4500
GCCCACGGGG ACCTCCTGGA GCTATTGGCC CCAAAGGTGA CCGGGGCTTT CCAGGGCCCC 4560
TGGGTGAGGC TGGAGAGAAG GGCGAACGTG GACCCCCAGG CCCAGCGGGA TCCCGGGGGC 4620
TGCCAGGGGT TGCTGGACGT CCTGGAGCCA AGGGTCCTGA AGGGCCACCA GGACCCACTG 4680
GCCGCCAAGG AGAGAAGGGG GAGCCTGGTC GCCCTGGGGA CCCTGCAGTG GTGGGACCTG 4740
CTGTTGCTGG ACCCAAAGGA GAAAAGGGAG ATGTGGGGCC CGCTGGGCCC AGAGGAGCTA 4800
CCGGAGTCCA AGGGGAACGG GGCCCACCCG GCTTGGTTCT TCCTGGAGAC CCTGGCCCCA 4860
AGGGAGACCC TGGAGACCGG GGTCCCATTG GCCTTACTGG CAGAGCAGGA CCCCCAGGTG 4920
ACTCAGGGCC TCCTGGAGAG AAGGGAGACC CTGGGCGGCC TGGCCCCCCA GGACCTGTTG 4980
GCCCCCGAGG ACGAGATGGT GAAGTTGGAG AGAAAGGTGA CGAGGGTCCT CCGGGTGACC 5040
CGGGTTTGCC TGGAAAAGCA GGCGAGCGTG GCCTTCGGGG GGCACCTGGA GTTCGGGGGC 5100
CTGTGGGTGA AAAGGGAGAC CAGGGAGATC CTGGAGAGGA TGGACGAAAT GGCAGCCCTG 5160
GATCATCTGG ACCCAAGGGT GACCGTGGGG AGCCGGGTCC CCCAGGACCC CCGGGACGGC 5220
TGGTAGACAC AGGACCTGGA GCCAGAGAGA AGGGAGAGCC TGGGGACCGC GGACAAGAGG 5280
GTCCTCGAGG GCCCAAGGGT GATCCTGGCC TCCCTGGAGC CCCTGGGGAA AGGGGCATTG 5340
AAGGGTTTCG GGGACCCCCA GGCCCACAGG GGGACCCAGG TGTCCGAGGC CCAGCAGGAG 5400
AAAAGGGTGA CCGGGGTCCC CCTGGGCTGG ATGGCCGGAG CGGACTGGAT GGGAAACCAG 5460
GAGCCGCTGG GCCCTCTGGG CCGAATGGTG CTGCAGGCAA AGCTGGGGAC CCAGGGAGAG 5520
ACGGGCTTCC AGGCCTCCGT GGAGAACAAG GCCTCCCTGG CCCCTCTGGT CCCCCTGGAT 5580
TACCGGGAAA GCCAGGCGAG GATGGGAAAC CTGGCCTGAA TGGAAAAAAC GGAGAACCTG 5640
GGGACCCTGG AGAAGACGGG AGGAAGGGAG AGAAAGGAGA TTCAGGCGCC TCTGGGAGAG 5700
AAGGTCGTGA TGGCCCCAAG GGTGAGCGTG GAGCTCCTGG TATCCTTGGA CCCCAGGGGC 5760
CTCCAGGCCT CCCAGGGCCA GTGGGCCCTC CTGGCCAGGG TTTTCCTGGT GTCCCAGGAG 5820
GCACGGGCCC CAAGGGTGAC CGTGGGGAGA CTGGATCCAA AGGGGAGCAG GGCCTCCCTG 5880
GAGAGCGTGG CCTGCGAGGA GAGCCTGGAA GTGTGCCGAA TGTGGATCGG TTGCTGGAAA 5940
CTGCTGGCAT CAAGGCATCT GCCCTGCGGG AGATCGTGGA GACCTGGGAT GAGAGCTCTG 6000
GTAGCTTCCT GCCTGTGCCC GAACGGCGTC GAGGCCCCAA GGGGGACTCA GGCGAACAGG 6060
GCCCCCCAGG CAAGGAGGGC CCCATCGGCT TTCCTGGAGA ACGCGGGCTG AAGGGCGACC 6120
GTGGAGACCC TGGCCCTCAG GGGCCACCTG GTCTGGCCCT TGGGGAGAGG GGCCCCCCCG 6180
GGCCTTCCGG CCTTGCCGGG GAGCCTGGAA AGCCTGGTAT TCCCGGGCTC CCAGGCAGGG 6240
CTGGGGGTGT GGGAGAGGCA GGAAGGCCAG GAGAGAGGGG AGAACGGGGA GAGAAAGGAG 6300
AACGTGGAGA ACAGGGCAGA GATGGCCCTC CTGGACTCCC TGGAACCCCT GGGCCCCCCG 6360
GACCCCCTGG CCCCAAGGTG TCTGTGGATG AGCCAGGTCC TGGACTCTCT GGAGAACAGG 6420
GACCCCCTGG ACTCAAGGGT GCTAAGGGGG AGCCGGGCAG CAATGGTGAC CAAGGTCCCA 6480
AAGGAGACAG GGGTGTGCCA GGCATCAAAG GAGACCGGGG AGAGCCTGGA CCGAGGGGTC 6540
AGGACGGCAA CCCGGGTCTA CCAGGAGAGC GTGGTATGGC TGGGCCTGAA GGGAAGCCGG 6600
GTCTGCAGGG TCCAAGAGGC CCCCCTGGCC CAGTGGGTGG TCATGGAGAC CCTGGACCAG 6660
CTGGTGCCCC GGGTCTTGCT GGCCCTGCAG GACCCCAAGG ACCTTCTGGC CTGAAGGGGG 6720
AGCCTGGAGA GACAGGACCT CCAGGACGGG GCCTGACTGG ACCTACTGGA GCTGTGGGAC 6780
TTCCTGGACC CCCCGGCCCT TCAGGCCTTG TGGGTCCACA GGGGTCTCCA GGTTTGCCTG 6840
GACAAGTGGG GGAGACAGGG AAGCCGGGAG CCCCAGGTCG AGATGGTGCC AGTGGAAAAG 6900
ATGGAGACAG AGGGAGCCCT GGTGTGCCAG GGTCACCAGG TCTGCCTGGC CCTGTCGGAC 6960
CTAAAGGAGA ACCTGGCCCC ACGGGGGCCC CTGGACAGGC TGTGGTCGGG CTCCCTGGAG 7020
CAAAGGGAGA GAAGGGAGCC CCTGGAGGCC TTGCTGGAGA CCTGGTGGGT GAGCCGGGAG 7080
CCAAAGGTGA CCGAGGACTG CCAGGGCCGC GAGGCGAGAA GGGTGAAGCT GGCCGTGCAG 7140
GGGAGCCCGG AGACCCTGGG GAAGATGGTC AGAAAGGGGC TCCAGGACCC AAAGGTTTCA 7200
AGGGTGACCC AGGAGTCGGG GTCCCGGGCT CCCCTGGGCC TCCTGGCCCT CCAGGTGTGA 7260
AGGGAGATCT GGGCCTCCCT GGCCTGCCCG GTGCTCCTGG TGTTGTTGGG TTCCCGGGTC 7320
AGACAGGCCC TCGAGGAGAG ATGGGTCAGC CAGGCCCTAG TGGAGAGCGG GGTCTGGCAG 7380
GCCCCCCAGG GAGAGAAGGA ATCCCAGGAC CCCTGGGGCC ACCTGGACCA CCGGGGTCAG 7440
TGGGACCACC TGGGGCCTCT GGACTCAAAG GAGACAAGGG AGACCCTGGA GTAGGGCTGC 7500
CTGGGCCCCG AGGCGAGCGT GGGGAGCCAG GCATCCGGGG TGAAGATGGC CGCCCCGGCC 7560
AGGAGGGACC CCGAGGACTC ACGGGGCCCC CTGGCAGCAG GGGAGAGCGT GGGGAGAAGG 7620
GTGATGTTGG GAGTGCAGGA CTAAAGGGTG ACAAGGGAGA CTCAGCTGTG ATCCTGGGGC 7680
CTCCAGGCCC ACGGGGTGCC AAGGGGGACA TGGGTGAACG AGGGCCTCGG GGCTTGGATG 7740
GTGACAAAGG ACCTCGGGGA GACAATGGGG ACCCTGGTGA CAAGGGCAGC AAGGGAGAGC 7800
CTGGTGACAA GGGCTCAGCC GGGTTGCCAG GACTGCGTGG ACTCCTGGGA CCCCAGGGTC 7860
AACCTGGTGC AGCAGGGATC CCTGGTGACC CGGGATCCCC AGGAAAGGAT GGAGTGCCTG 7920
GTATCCGAGG AGAAAAAGGA GATGTTGGCT TCATGGGTCC CCGGGGCCTC AAGGGTGAAC 7980
GGGGAGTGAA GGGAGCCTGT GGCCTTGATG GAGAGAAGGG AGACAAGGGA GAAGCTGGTC 8040
CCCCAGGCCG CCCCGGGCTG GCAGGACACA AAGGAGAGAT GGGGGAGCCT GGTGTGCCGG 8100
GCCAGTCGGG GGCCCCTGGC AAGGAGGGCC TGATCGGTCC CAAGGGTGAC CGAGGCTTTG 8160
ACGGGCAGCC AGGCCCCAAG GGTGACCAGG GCGAGAAAGG GGAGCGGGGA ACCCCAGGAA 8220
TTGGGGGCTT CCCAGGCCCC AGTGGAAATG ATGGCTCTGC TGGTCCCCCA GGGCCACCTG 8280
GCAGTGTTGG TCCCAGAGGC CCCGAAGGAC TTCAGGGCCA GAAGGGTGAG CGAGGTCCCC 8340
CCGGAGAGAG AGTGGTGGGG GCTCCTGGGG TCCCTGGAGC TCCTGGCGAG AGAGGGGAGC 8400
AGGGGCGGCC AGGGCCTGCC GGTCCTCGAG GCGAGAAGGG AGAAGCTGCA CTGACGGAGG 8460
ATGACATCCG GGGCTTTGTG CGCCAAGAGA TGAGTCAGCA CTGTGCCTGG CAGGGCCAGT 8520
TCATCGCATC TGGATCACGA CCCCTCCCTA GTTATGCTGC AGACACTGCC GGCTCCCAGC 8580
TCCATGCTGT GCCTGTGCTC CGCGTCTCTC ATGCAGAGGA GGAAGAGCGG GTACCCCCTG 8640
AGGATGATGA GTACTCTGAA TACTCCGAGT ATTCTGTGGA GGAGTACCAG GACCCTGAAG 8700
CTCCTTGGGA TAGTGATGAC CCCTGTTCCC TGCCACTGGA TGAGGGCTCC TGCACTGCCT 8760
ACACCCTGCG CTGGTACCAT CGGGCTGTGA CAGGCAGCAC AGAGGCCTGT CACCCTTTTG 8820
TCTATGGTGG CTGTGGAGGG AATGCCAACC GTTTTGGGAC CCGTGAGGCC TGCGAGCGCC 8880
GCTGCCCACC CCGGGTGGTC CAGAGCCAGG GGACAGGTAC TGCCCAGGAC TGAGGCCCAG 8940
ATAATGAGCT GAGATTCAGC ATCCCCTGGA GGAGTCGGGG TCTCAGCAGA ACCCCACTGT 9000
CCCTCCCCTT GGTGCTAGAG GCTTGTGTGC ACGTGAGCGT GCGAGTGCAC GTCCGTTATT 9060
TCAGTGACTT GGTCCCGTGG GTCTAGCCTT CCCCCCTGTG GACAAACCCC CATTGTGGCT 9120
CCTGCCACCC TGGCAGATGA CTCACTGTGG GGGGGTGGCT GTGGGCAGTG AGCGGATGTG 9180
ACTGGCGTCT GACCCGCCCC TTGACCCAAG CCTGTGATGA CATGGTGCTG ATTCTGGGGG 9240
GCATTAAAGC TGCTGTTTTA AAAGGCAAAA AA

Seq ID NO: 63 Protein sequence:
Protein Accession #: NP_000085.1

1 11 21 31 41 51
| | | | | |
MTLRLLVAAL CAGILAEAPR VRAQHRERVT CTRLYAADIV FLLDGSSSIG RSNFREVRSF 60
LEGLVLPFSG AASAQGVRFA TVQYSDDPRT EFGLDALGSG GDVIRAIREL SYKGGNTRTG 120
AAILHVADHV FLPQLARPGV PKVCILITDG KSQDLVDTAA QRLKGQGVKL FAVGIKNADP 180
EELKRVASQP TSDFFFFVND FSILRTLLPL VSRRVCTTAG GVPVTRPPDD STSAPRDLVL 240
SEPSSQSLRV QWTAASGPVT GYKVQYTPLT GLGQPLPSER QEVNVPAGET SVRLRGLRPL 300
TEYQVTVIAL YANSIGEAVS GTARTTALEG PELTIQNTTA HSLLVAWRSV PGATGYRVTW 360
RVLSGGPTQQ QELGPGQGSV LLRDLEPGTD YEVTVSTLFG RSVGPATSLM ARTDASVEQT 420
LRPVILGPTS ILLSWNLVPE ARGYRLEWRR ETGLEPPQKV VLPSDVTRYQ LDGLQPGTEY 480
RLTLYTLLEG HEVATPATVV PTGPELPVSP VTDLQATELP GQRVRVSWSP VPGATQYRII 540
VRSTQGVERT LVLPGSQTAF DLDDVQAGLS YTVRVSARVG PREGSASVLT VRREPETPLA 600
VPGLRVVVSD ATRVRVAWGP VPGASGFRIS WSTGSGPESS QTLPPDSTAT DITGLQPGTT 660
YQVAVSVLRG REEGPAAVIV ARTDPLGPVR TVHVTQASSS SVTITWTRVP GATGYRVSWH 720
SAHGPEKSQL VSGEATVAEL DGLEPDTEYT VHVRAHVAGV DGPPASVVVR TAPEPVGRVS 780
RLQILNASSD VLRITWVGVT GATAYRLAWG RSEGGPMRHQ ILPGNTDSAE IRGLEGGVSY 840
SVRVTALVGD REGTPVSIVV TTPPEAPPAL GTLHVVQRGE HSLRLRWEPV PRAQGFLLHW 900
QPEGGQEQSR VLGPELSSYH LDGLEPATQY RVRLSVLGPA GEGPSAEVTA RTESPRVPSI 960
ELRVVDTSID SVTLAWTPVS RASSYILSWR PLRGPGQEVP GSPQTLPGIS SSQRVTGLEP 1020
GVSYIFSLTP VLDGVRGPEA SVTQTPVCPR GLADVVFLPH ATQDNAHRAE ATRRVLERLV 1080
LALGPLGPQA VQVGLLSYSH RPSPLFPLNG SHDLGIILQR IRDMPYMDPS GNNLGTAVVT 1140
AHRYMLAPDA PGRRQHVPGV MVLLVDEPLR GDIFSPIREA QASGLNVVML GMAGADPEQL 1200
RRLAPGMDSV QTFFAVDDGP SLDQAVSGLA TALCQASFTT QPRPEPCPVY CPKGQKGEPG 1260
EMGLRGQVGP PGDPGLPGRT GAPGPQGPPG SATAKGERGF PGADGRPGSP GRAGNPGTPG 1320
APGLKGSPGL PGPRGDPGER GPRGPKGEPG APGQVIGGEG PGLPGRKGDP GPSGPPGPRG 1380
PLGDPGPRGP PGLPGTAMKG DKGDRGERGP PGPGEGGIAP GEPGLPGLPG SPGPQGPVGP 1440
PGKKGEKGDS EDGAPGLPGQ PGSPGEQGPR GPPGAIGPKG DRGFPGPLGE AGEKGERGPP 1500
GPAGSRGLPG VAGRPGAKGP EGPPGPTGRQ GEKGEPGRPG DPAVVGPAVA GPKGEKGDVG 1560
PAGPRGATGV QGERGPPGLV LPGDPGPKGD PGDRGPIGLT GRAGPPGDSG PPGEKGDPGR 1620
PGPPGPVGPR GRDGEVGEKG DEGPPGDPGL PGKAGERGLR GAPGVRGPVG EKGDQGDPGE 1680
DGRNGSPGSS GPKGDRGEPG PPGPPGRLVD TGPGAREKGE PGDRGQEGPR GPKGDPGLPG 1740
APGERGIEGF RGPPGPQGDP GVRGPAGEKG DRGPPGLDGR SGLDGKPGAA GPSGPNGAAG 1800
KAGDPGRDGL PGLRGEQGLP GPSGPPGLPG KPGEDGKPGL NGKNGEPGDP GEDGRKGEKG 1860
DSGASGREGR DGPKGERGAP GILGPQGPPG LPGPVGPPGQ GFPGVPGGTG PKGDRGETGS 1920
KGEQGLPGER GLRGEPGSVP NVDRLLETAG IKASALREIV ETWDESSGSF LPVPERRRGP 1980
KGDSGEQGPP GKEGPIGFPG ERGLKGDRGD PGPQGPPGLA LGERGPPGPS GLAGEPGKPG 2040
IPGLPGRAGG VGEAGRPGER GERGEKGERG EQGRDGPPGL PGTPGPPGPP GPKVSVDEPG 2100
PGLSGEQGPP GLKGAKGEPG SNGDQGPKGD RGVPGIKGDR GEPGPRGQDG NPGLPGERGM 2160
AGPEGKPGLQ GPRGPPGPVG GHGDPGPPGA PGLAGPAGPQ GPSGLKGEPG ETGPPGRGLT 2220
GPTGAVGLPG PPGPSGLVGP QGSPGLPGQV GETGKPGAPG RDGASGKDGD RGSPGVPGSP 2280
GLPGPVGPKG EPGPTGAPGQ AVVGLPGAKG EKGAPGGLAG DLVGEPGAKG DRGLPGPRGE 2340
KGEAGRAGEP GDPGEDGQKG APGPKGFKGD PGVGVPGSPG PPGPPGVKGD LGLPGLPGAP 2400
GVVGFPGQTG PRGEMGQPGP SGERGLAGPP GREGIPGPLG PPGPPGSVGP PGASGLKGDK 2460
GDPGVGLPGP RGERGEPGIR GEDGRPGQEG PRGLTGPPGS RGERGEKGDV GSAGLKGDKG 2520
DSAVILGPPG PRGAKGDMGE RGPRGLDGDK GPRGDNGDPG DKGSKGEPGD KGSAGLPGLR 2580
GLLGPQGQPG AAGIPGDPGS PGKDGVPGIR GEKGDVGFMG PRGLKGERGV KGACGLDGEK 2640
GDKGEAGPPG RPGLAGHKGE MGEPGVPGQS GAPGKEGLIG PKGDRGFDGQ PGPKGDQGEK 2700
GERGTPGIGG FPGPSGNDGS AGPPGPPGSV GPRGPEGLQG QKGERGPPGE RVVGAPGVPG 2760
APGERGEQGR PGPAGPRGEK GEAALTEDDI RGFVRQEMSQ HCACQGQFIA SGSRPLPSYA 2820
ADTAGSQLHA VPVLRVSHAE EEERVPPEDD EYSEYSEYSV EEYQDPEAPW DSDDPCSLPL 2880
DEGSCTAYTL RWYHRAVTGS TEACHPFVYG GCGGNANRFG TREACERRCP PRVVQSQGTG 2940
TAQD

Seq ID NO: 64 DNA sequence

Nucleic Acid Accession #: NM_006945 

Coding sequence: 1-219 

1 11 21 31 41 51
| | | | | |
ATGTCTTATC AACAGCAGCA GTGCAAGCAG CCCTGCCAGC CACCTCCTGT GTGCCCCACG 60
CCAAAGTGCC CAGAGCCATG TCCACCCCCG AAGTGCCCTG AGCCCTGCCC ACCACCAAAG 120
TGTCCACAGC CCTGCCCACC TCAGCAGTGC CAGCAGAAAT ATCCTCCTGT GACACCTTCC 180
CCACCCTGCC AGCCAAAGTA TCCACCGAAG AGCAAGTAA

Seq ID NO: 65 Protein sequence: 
Protein Accession #: NP_008876 

1 11 21 31 41 51 

| | | | | |

MSYQQQQCKQ PCQPPPVCPT PKCPEPCPPP KCPEPCPPPK CPQPCPPQQC QQKYPPVTPS 60 
PPCQPKYPPK SK 

Seq ID NO: 66 DNA sequence 

Nucleic Acid Accession #: NM_005629.1 

Coding sequence: 639-2546

1 11 21 31 41 51 

| | | | | |

TAGTCGGAGC GAGGTGGCGA GTCGCTGAGC CCGCCGCGGC CCCGAGAGCG GCTGCAGCCG 60 

CCGCCGCCGG GAAGGAGAGG GCGAGGCGCG CCCGAGCCGC CGCCGCCGCC GCCACCGCCG 120 

CCGCCGCCAC CACCGCCACC GGAGTCGCGG GCCAGCCGGG CAGCCTCCGC GGGCCCCGGC 180 

CGGGGCGGGG GGCGCGGGCC ACAGGCCCCT GCTCCGGCCG TCGTTTGCAG ACCGCGGGCG 240 

CCGATGTCGC CCGCGCCCCG TTAGGATGAG TCTCGGGTCG GGCGAGGAGC CGCCGCAGCC 300 

GCCGCCGCCC GAGCCGCGGG CAGGAGCCTC GGGAGCCGCC GCCGCCGCCG CCGCCGCCCG 360 

GCCGGGCCCC GACGCCGCCC GCGCGCCCCC GGGCCCCCGA CACACATGAG ATTCTTCAGG 420 

CTCACTTTCA AGTGCTTCGT GGACTGCTTC TGACTGCGCC GCCCGCGCCC CGCACCCCGC 480 

CGTCCGCCCG CCGCCCCGTC CCCCGGCCCG GCCGCCCCCC GGCCCCCGGC CGGCCCGCGC 540 

CCTCGGGGCC CTCCCCGGTG CCGCCGGTGC CCCCCGCCTG ACCGCCGCCC CCCGTGAGGC 600

GCCGCGACCC CGGCCCGGCC GTGCGGCCCG CCGGGGCCAT GGCGAAGAAG AGCGCCGAGA 660

ACGGCATCTA TAGCGTGTCC GGCGACGAGA AGAAGGGCCC CCTCATCGCG CCCGGGCCCG 720

ACGGGGCCCC GGCCAAGGGC GACGGCCCCG TGGGCCTGGG GACACCCGGC GGCCGCCTGG 780 

CCGTGCCGCC GCGCGAGACC TGGACGCGCC AGATGGACTT CATCATGTCG TGCGTGGGCT 840 

TCGCCGTGGG CTTGGGCAAC GTGTGGCGCT TCCCCTACCT GTGCTACAAG AACGGCGGAG 900 

GTGTGTTCCT TATTCCCTAC GTCCTGATCG CCCTGGTTGG AGGAATCCCC ATTTTCTTCT 960

TAGAGATCTC GCTGGGCCAG TTCATGAAGG CCGGCAGCAT CAATGTCTGG AACATCTGTC 1020

CCCTGTTCAA AGGCCTGGGC TACGCCTCCA TGGTGATCGT CTTCTACTGC AACACCTACT 1080

ACATCATGGT GCTGGCCTGG GGCTTCTATT ACCTGGTCAA GTCCTTTACC ACCACGCTGC 1140

CCTGGGCCAC ATGTGGCCAC ACCTGGAACA CTCCCGACTG CGTGGAGATC TTCCGCCATG 1200

AAGACTGTGC CAATGCCAGC CTGGCCAACC TCACCTGTGA CCAGCTTGCT GACCGCCGGT 1260 

CCCCTGTCAT CGAGTTCTGG GAGAACAAAG TCTTGAGGCT GTCTGGGGGA CTGGAGGTGC 1320 

CAGGGGCCCT CAACTGGGAG GTGACCCTTT GTCTGCTGGC CTGCTGGGTG CTGGTCTACT 1380 

TCTGTGTCTG GAAGGGGGTC AAATCCACGG GAAAGATCGT GTACTTCACT GCTACATTCC 1440

CCTACGTGGT CCTGGTCGTG CTGCTGGTGC GTGGAGTGCT GCTGCCTGGC GCCCTGGATG 1500 

GCATCATTTA CTATCTCAAG CCTGACTGGT CAAAGCTGGG GTCCCCTCAG GTGTGGATAG 1560 

ATGCGGGGAC CCAGATTTTC TTTTCTTACG CCATTGGCCT GGGGGCCCTC ACAGCCCTGG 1620

GCAGCTACAA CCGCTTCAAC AACAACTGCT ACAAGGACGC CATCATCCTG GCTCTCATCA 1680 

ACAGTGGGAC CAGCTTCTTT GCTGGCTTCG TGGTCTTCTC CATCCTGGGC TTCATGGCTG 1740 

CAGAGCAGGG CGTGCACATC TCCAAGGTGG CAGAGTCAGG GCCGGGCCTG GCCTTCATCG 1800 

CCTACCCGCG GGCTGTCACG CTGATGCCAG TGGCCCCACT CTGGGCTGCC CTGTTCTTCT 1860 

TCATGCTGTT GCTGCTTGGT CTCGACAGCC AGTTTGTAGG TGTGGAGGGC TTCATCACCG 1920 

GCCTCCTCGA CCTCCTCCCG GCCTCCTACT ACTTCCGTTT CCAAAGGGAG ATCTCTGTGG 1980 

CCCTCTGTTG TGCCCTCTGC TTTGTCATCG ATCTCTCCAT GGTGACTGAT GGCGGGATGT 2040

ACGTCTTCCA GCTGTTTGAC TACTACTCGG CCAGCGGCAC CACCCTGCTC TGGCAGGCCT 2100

TTTGGGAGTG CGTGGTGGTG GCCTGGGTGT ACGGAGCTGA CCGCTTCATG GACGACATTG 2160

CCTGTATGAT CGGGTACCGA CCTTGCCCCT GGATGAAATG GTGCTGGTCC TTCTTCACCC 2220

CGCTGGTCTG CATGGGCATC TTCATCTTCA ACGTTGTGTA CTACGAGCCG CTGGTCTACA 2280

ACAACACCTA CGTGTACCCG TGGTGGGGTG AGGCCATGGG CTGGGCCTTC GCCCTGTCCT 2340 

CCATGCTGTG CGTGCCGCTG CACCTCCTGG GCTGCCTCCT CAGGGCCAAG GGCACCATGG 2400 

CTGAGCGCTG GCAGCACCTG ACCCAGCCCA TCTGGGGCCT CCACCACTTG GAGTACCGAG 2460 

CTCAGGACGC AGATGTCAGG GGCCTGACCA CCCTGACCCC AGTGTCCGAG AGCAGCAAGG 2520

TCGTCGTGGT GGAGAGTGTC ATGTGACAAC TCAGCTCACA TCACCAGCTC ACCTCTGGTA 2580

GCCATAGCAG CCCCTGCTTC AGCCCCACCG CACCCCTCCA GGGGGCCTGC CTTTCCCTGA 2640 

CACTTTTGGG GTCTGCCTGG GGGAGGAGGG GAGAAAGCAC CATGAGTGCT CACTAAAACA 2700 

ACTTTTTCCA TTTTTAATAA AACGCCAAAA ATATCACAAC CCACCAAAAA TAGATGCCTC 2760 

TCCCCCTCCA GCCCTAGCCG AGCTGGTCCT AGGCCCCGCC TAGTGCCCCA CCCCCACCCA 2820

CAGTGCTGCA CTCCTCCTGC CCCTGCCACG CCCACCCCCT GCCCACCTCT CCAGGCTCTG 2880

CTCTGCAGCA CACCCGTGGG TGACCCCTCA CCCCAGAAGC AGCAGTGGCA GCTTGGGAAA 2940

TGTGAGGAAG GGAAGGAGGG AGAGACGGGA GGGAGGAGAG AGAGGAGAAG GGAGGCAGGG 3000

GAGGGGCAGC AGAACCAAGG CAAATATTTC AGCTGGGCTA TACCCCTCTC CCCATCCCTG 3060 

TTATAGAAGC TTAGAGAGCC AGCCAGCAAT GGAACCTTCT GGTTCCTGCG CCAATCGCCA 3120 

CCAGTATCAA TTGTGTGAGC TTGGGTGCGA GTGCACGCGT GCGTGAGTAC GGAGAGTATA 3180 

TATAGATCTC TATCTCTTAG CAAAGGTGAA TGCCAGATGT AAATGGCGCC TCTGGGCAAA 3240 

GGAGGCTTGT ATTTTGCACA TTTTATAAAA ACTTGAGAGA ATGAGATTTC TGCTTGTATA 3300 

TTTCTAAAAA GAGGAAGGAG CCCAAACCAT CCTCTCCTTA CCACTCCCAT CCCTGTGAGC 3360 

CCTACCTTAC CCCTCTGCCC CTAGCCAAGG AGTGTGAATT TATAGATCTA ACTTTCATAG 3420 

GCAAAACAAA AGCTTCGAGC TGTTGCGTGT GTGAGTCTGT TGTGTGGATG TGCGTGTGTG 3480 

GTCCCCAGCC CCAGACTGGA TTGGAAAAGT GCATGGTGGG GGCCTCGGGG CTGTCCCCAC 3540 

GCTGTCCCTT TGCCACAAGT CTGTGGGGCA AGAGGCTGCA ATATTCCGTC CTGGGTGTCT 3600 

GGGCTGCTAA CCTGGCCTGC TCAGGCTTCC CACCCTGTGC GGGGCACACC CCCAGGAAGG 3660 

GACCCTGGAC ACGGCTCCCA CGTCCAGGCT TAAGGTGGAT GCACTTCCCG CACCTCCAGT 3720

CTTCTGTGTA GCAGCTTTAA CCCACGTTTG TCTGTCACGT CCAGTCCCGA GACGGCTGAG 3780

TGACCCCAAG AAAGGCTTCC CCGACACCCA GACAGAGGCT GCAGGGCTGG GGCTGGGTGA 3840

GGGTGGCGGG CCTGCGGGGA CATTCTACTG TGCTAAAAAG CCACTGCAGA CATAGCAATA 3900
AAAACATGTC ATTTTCC 

Seq ID NO: 67 Protein sequence: 
Protein Accession #: NP_005620.1 

1 11 21 31 41 51 

| | | | | |

MAKKSAENGI YSVSGDEKKG PLIAPGPDGA PAKGDGPVGL GTPGGRLAVP PRETWTRQMD 60

FIMSCVGFAV GLGNVWRFPY LCYKNGGGVF LIPYVLIALV GGIPIFFLEI SLGQFMKAGS 120

INVWNICPLF KGLGYASMVI VFYCNTYYIM VLAWGFYYLV KSFTTTLPWA TCGHTWNTPD 180

CVEIFRHEDC ANASLANLTC DQLADRRSPV IEFWENKVLR LSGGLEVPGA LNWEVTLCLL 240

ACWVLVYFCV WKGVKSTGKI VYFTATFPYV VLVVLLVRGV LLPGALDGII YYLKPDWSKL 300

GSPQVWIDAG TQIFFSYAIG LGALTALGSY NRFNNNCYKD AIILALINSG TSFFAGFVVF 360

SILGFMAAEQ GVHISKVAES GPGLAFIAYP RAVTLMPVAP LWAALFFFML LLLGLDSQFV 420

GVEGFITGLL DLLPASYYFR FQREISVALC CALCFVIDLS MVTDGGMYVF QLFDYYSASG 480

TTLLWQAFWE CVVVAWVYGA DRFMDDIACM IGYRPCPWMK WCWSFFTPLV CMGIFIFNVV 540

YYEPLVYNNT YVYPWWGEAM GWAFALSSML CVPLHLLGCL LRAKGTMAER WQHLTQPIWG 600
LHHLEYRAQD ADVRGLTTLT PVSESSKVVV VESVM



Seq ID NO: 68 DNA sequence 

Nucleic Acid Accession #: NM_021953.1 

Coding sequence: 178-2469 

1 11 21 31 41 51

| | | | | |

GGCACGAGGG GGACCCGGCC GGTCCGGCGC GAGCCCCCGT CCGGGGCCCT GGCTCGGCCC 60 

CCAGGTTGGA GGAGCCCGGA GCCCGCCTTC GGAGCTACGG CCTAACGGCG GCGGCGACTG 120 

CAGTCTGGAG GGTCCACACT TGTGATTCTC AATGGAGAGT GAAAACGCAG ATTCATAATG 180

AAAGCTAGCC CCCGTCGGCC ACTGATTCTC AAAAGACGGA GGCTGCCCCT TCCTGTTCAA 240

AATGCCCCAA GTGAAACATC AGAGGAGGAA CCTAAGAGAT CCCCTGCCCA ACAGGAGTCT 300

AATCAAGCAG AGGCCTCCAA GGAAGTGGCG GAGTCCAACT CTTGCAAGTT TCCAGCTGGG 360

ATCAAGATTA TTAACCACCC CACCATGCCC AACACGCAAG TAGTGGCCAT CCCCAACAAT 420

GCTAATATTC ACAGCATCAT CACAGCACTG ACTGCCAAGG GAAAAGAGAG TGGCAGTAGT 480 

GGGCCCAACA AATTCATCCT CATCAGCTGT GGGGGAGCCC CAACTCAGCC TCCAGGACTC 540 

CGGCCTCAAA CCCAAACCAG CTATGATGCC AAAAGGACAG AAGTGACCCT GGAGACCTTG 600 

GGACCAAAAC CTGCAGCTAG GGATGTGAAT CTTCCTAGAC CACCTGGAGC CCTTTGCGAG 660

CAGAAACGGG AGACCTGTGC AGATGGTGAG GCAGCAGGCT GCACTATCAA CAATAGCCTA 720 

TCCAACATCC AGTGGCTTCG AAAGATGAGT TCTGATGGAC TGGGCTCCCG CAGCATCAAG 780 

CAAGAGATGG AGGAAAAGGA GAATTGTCAC CTGGAGCAGC GACAGGTTAA GGTTGAGGAG 840 

CCTTCGAGAC CATCAGCGTC CTGGCAGAAC TCTGTGTCTG AGCGGCCACC CTACTCTTAC 900 

ATGGCCATGA TACAATTCGC CATCAACAGC ACTGAGAGGA AGCGCATGAC TTTGAAAGAC 960

ATCTATACGT GGATTGAGGA CCACTTTCCC TACTTTAAGC ACATTGCCAA GCCAGGCTGG 1020

AAGAACTCCA TCCGCCACAA CCTTTCCCTG CACGACATGT TTGTCCGGGA GACGTCTGCC 1080 

AATGGCAAGG TCTCCTTCTG GACCATTCAC CCCAGTGCCA ACCGCTACTT GACATTGGAC 1140 

CAGGTGTTTA AGCCACTGGA CCCAGGGTCT CCACAATTGC CCGAGCACTT GGAATCACAG 1200

CAGAAACGAC CGAATCCAGA GCTCCGCCGG AACATGACCA TCAAAACCGA ACTCCCCCTG 1260

GGCGCACGGC GGAAGATGAA GCCACTGCTA CCACGGGTCA GCTCATACCT GGTACCTATC 1320

CAGTTCCCGG TGAACCAGTC ACTGGTGTTG CAGCCCTCGG TGAAGGTGCC ATTGCCCCTG 1380 

GCGGCTTCCC TCATGAGCTC AGAGCTTGCC CGCCATAGCA AGCGAGTCCG CATTGCCCCC 1440 

AAGGTGCTGC TAGCTGAGGA GGGGATAGCT CCTCTTTCTT CTGCAGGACC AGGGAAAGAG 1500 

GAGAAACTCC TGTTTGGAGA AGGGTTTTCT CCTTTGCTTC CAGTTCAGAC TATCAAGGAG 1560 

GAAGAAATCC AGCCTGGGGA GGAAATGCCA CACTTAGCGA GACCCATCAA AGTGGAGAGC 1620 

CCTCCCTTGG AAGAGTGGCC CTCCCCGGCC CCATCTTTCA AAGAGGAATC ATCTCACTCC 1680 

TGGGAGGATT CGTCCCAATC TCCCACCCCA AGACCCAAGA AGTCCTACAG TGGGCTTAGG 1740 

TCCCCAACCC GGTGTGTCTC GGAAATGCTT GTGATTCAAC ACAGGGAGAG GAGGGAGAGG 1800 

AGCCGGTCTC GGAGGAAACA GCATCTACTG CCTCCCTGTG TGGATGAGCC GGAGCTGCTC 1860 

TTCTCAGAGG GGCCCAGTAC TTCCCGCTGG GCCGCAGAGC TCCCGTTCCC AGCAGACTCC 1920

TCTGACCCTG CCTCCCAGCT CAGCTACTCC CAGGAAGTGG GAGGACCTTT TAAGACACCC 1980

ATTAAGGAAA CGCTGCCCAT CTCCTCCACC CCGAGCAAAT CTGTCCTCCC CAGAACCCCT 2040

GAATCCTGGA GGCTCACGCC CCCAGCCAAA GTAGGGGGAC TGGATTTCAG CCCAGTACAA 2100

ACCTCCCAGG GTGCCTCTGA CCCCTTGCCT GACCCCCTGG GGCTGATGGA TCTCAGCACC 2160 

ACTCCCTTGC AAAGTGCTCC CCCCCTTGAA TCACCGCAAA GGCTCCTCAG TTCAGAACCC 2220 

TTAGACCTCA TCTCCGTCCC CTTTGGCAAC TCTTCTCCCT CAGATATAGA CGTCCCCAAG 2280 

CCAGGCTCCC CGGAGCCACA GGTTTCTGGC CTTGCAGCCA ATCGTTCTCT GACAGAAGGC 2340 

CTGGTCCTGG ACACAATGAA TGACAGCCTC AGCAAGATCC TGCTGGACAT CAGCTTTCCT 2400 

GGCCTGGACG AGGACCCACT GGGCCCTGAC AACATCAACT GGTCCCAGTT TATTCCTGAG 2460 

CTACAGTAGA GCCCTGCCCT TGCCCCTGTG CTCAAGCTGT CCACCATCCC GGGCACTCCA 2520 

AGGCTCAGTG CACCCCAAGC CTCTGAGTGA GGACAGCAGG CAGGGACTGT TCTGCTCCTC 2580 

ATAGCTCCCT GCTGCCTGAT TATGCAAAAG TAGCAGTCAC ACCCTAGCCA CTGCTGGGAC 2640

CTTGTGTTCC CCAAGAGTAT CTGATTCCTC TGCTGTCCCT GCCAGGAGCT GAAGGGTGGG 2700

AACAACAAAG GCAATGGTGA AAAGAGATTA GGAACCCCCC AGCCTGTTTC CATTCTCTGC 2760

CCAGCAGTCT CTTACCTTCC CTGATCTTTG CAGGGTGGTC CGTGTAAATA GTATAAATTC 2820

TCCAAATTAT CCTCTAATTA TAAATGTAAG CTTATTTCCT TAGATCATTA TCCAGAGACT 2880

GCCAGAAGGT GGGTAGGATG ACCTGGGGTT TCAATTGACT TCTGTTCCTT GCTTTTAGTT 2940

TTGATAGAAG GGAAGACCTG CAGTGCACGG TTTCTTCCAG GCTGAGGTAC CTGGATCTTG 3000

GGTTCTTCAC TGCAGGGACC CAGACAAGTG GATCTGCTTG CCAGAGTCCT TTTTGCCCCT 3060

CCCTGCCACC TCCCCGTGTT TCCAAGTCAG CTTTCCTGCA AGAAGAAATC CTGGTTAAAA 3120 

AAGTCTTTTG TATTGGGTCA GGAGTTGAAT TTGGGGTGGG AGGATGGATG CAACTGAAGC 3180 

AGAGTGTGGG TGCCCAGATG TGCGCTATTA GATGTTTCTC TGATAATGTC CCCAATCATA 3240

CCAGGGAGAC TGGCATTGAC GAGAACTCAG GTGGAGGCTT GAGAAGGCCG AAAGGGCCCC 3300

TGACCTGCCT GGCTTCCTTA GCTTGCCCCT CAGCTTTGCA AAGAGCCACC CTAGGCCCCA 3360 

GCTGACCGCA TGGGTGTGAG CCAGCTTGAG AACACTAACT ACTCAATAAA AGCGAAGGTG 3420 
GACCNAAAAA AAAAAAAAAA AAAA 



Seq ID NO: 69 Protein sequence: 
Protein Accession #: NP_068772.1 

1 11 21 31 41 51

| | | | | |

MKASPRRPLI LKRRRLPLPV QNAPSETSEE EPKRSPAQQE SNQAEASKEV AESNSCKFPA 60

GIKIINHPTM PNTQVVAIPN NANIHSIITA LTAKGKESGS SGPNKFILIS CGGAPTQPPG 120

LRPQTQTSYD AKRTEVTLET LGPKPAARDV NLPRPPGALC EQKRETCADG EAAGCTINNS 180

LSNIQWLRKM SSDGLGSRSI KQEMEEKENC HLEQRQVKVE EPSRPSASWQ NSVSERPPYS 240

YMAMIQFAIN STERKRMTLK DIYTWIEDHF PYFKHIAKPG WKNSIRHNLS LHDMFVRETS 300

ANGKVSFWTI HPSANRYLTL DQVFKPLDPG SPQLPEHLES QQKRPNPELR RNMTIKTELP 360

LGARRKMKPL LPRVSSYLVP IQFPVNQSLV LQPSVKVPLP LAASLMSSEL ARHSKRVRIA 420

PKVLLAEEGI APLSSAGPGK EEKLLFGEGF SPLLPVQTIK EEEIQPGEEM PHLARPIKVE 480

SPPLEEWPSP APSFKEESSH SWEDSSQSPT PRPKKSYSGL RSPTRCVSEM LVIQHRERRE 540

RSRSRRKQHL LPPCVDEPEL LFSEGPSTSR WAAELPFPAD SSDPASQLSY SQEVGGPFKT 600

PIKETLPISS TPSKSVLPRT PESWRLTPPA KVGGLDFSPV QTSQGASDPL PDPLGLMDLS 660

TTPLQSAPPL ESPQRLLSSE PLDLISVPFG NSSPSDIDVP KPGSPEPQVS GLAANRSLTE 720
GLVLDTMNDS LSKILLDISF PGLDEDPLGP DNINWSQFIP ELQ

Seq ID NO: 70 DNA sequence

Nucleic Acid Accession #: BC006529. 

Coding sequence: 178-2424 



1 11 21 31 41 51 

| | | | | |

GGCACGAGGG GGACCCGGCC GGTCCGGCGC GAGCCCCCGT CCGGGGCCCT GGCTCGGCCC 60 

CCAGGTTGGA GGAGCCCGGA GCCCGCCTTC GGAGCTACGG CCTAACGGCG GCGGCGACTG 120 

CAGTCTGGAG GGTCCACACT TGTGATTCTC AATGGAGAGT GAAAACGCAG ATTGATAATG 180

AAAACTAGCC CCCGTCGGCC ACTGATTCTC AAAAGACGGA GGCTGCCCCT TCCTGTTCAA 240 

AATGCCCCAA GTGAAACATC AGAGGAGGAA CCTAAGAGAT CCCCTGCCCA ACAGGAGTCT 300 

AATCAAGCAG AGGCCTCCAA GGAAGTGGCA GAGTCCAACT CTTGCAAGTT TCCAGCTGGG 360 

ATCAAGATTA TTAACCACCC CACCATGCCC AACACGCAAG TAGTGGCCAT CCCCAACAAT 420

GCTAATATTC ACAGCATCAT CACAGCACTG ACTGCCAAGG GAAAAGAGAG TGGCAGTAGT 480

GGGCCCAACA AATTCATCCT CATCAGCTGT GGGGGAGCCC CAACTCAGCC TCCAGGACTC 540 

CGGCCTCAAA CCCAAACCAG CTATGATGCC AAAAGGACAG AAGTGACCCT GGAGACCTTG 600 

GGACCAAAAC CTGCAGCTAG GGATGTGAAT CTTCCTAGAC CACCTGGAGC CCTTTGCGAG 660 

CAGAAACGGG AGACCTGTGC AGATGGTGAG GCAGCAGGCT GCACTATCAA CAATAGCCTA 720

TCCAACATCC AGTGGCTTCG AAAGATGAGT TCTGATGGAC TGGGCTCCCG CAGCATCAAG 780

CAAGAGATGG AGGAAAAGGA GAATTGTCAC CTGGAGCAGC GACAGGTTAA GGTTGAGGAG 840 

CCTTCGAGAC CATCAGCGTC CTGGCAGAAC TCTGTGTCTG AGCGGCCACC CTACTCTTAC 900 

ATGGCCATGA TACAATTCGC CATCAACAGC ACTGAGAGGA AGCGCATGAC TTTGAAAGAC 960 

ATCTATACGT GGATTGAGGA CCACTTTCCC TACTTTAAGC ACATTGCCAA GCCAGGCTGG 1020

AAGAACTCCA TCCGCCACAA CCTTTCCCTG CACGACATGT TTGTCCGGGA GACGTCTGCC 1080

AATGGCAAGG TCTCCTTCTG GACCATTCAC CCCAGTGCCA ACCGCTACTT GACATTGGAC 1140

CAGGTGTTTA AGCAGCAGAA ACGACCGAAT CCAGAGCTCC GCCGGAACAT GACCATCAAA 1200

ACCGAACTCC CCCTGGGCGC ACGGCGGAAG ATGAAGCCAC TGCTACCACG GGTCAGCTCA 1260 

TACCTGGTAC CTATCCAGTT CCCGGTGAAC CAGTCACTGG TGTTGCAGCC CTCGGTGAAG 1320 

GTGCCATTGC CCCTGGCGGC TTCCCTCATG AGCTCAGAGC TTGCCCGCCA TAGCAAGCGA 1380

GTCCGCATTG CCCCCAAGGT GCTGCTAGCT GAGGAGGGGA TAGCTCCTCT TTCTTCTGCA 1440 

GGACCAGGGA AAGAGGAGAA ACTCCTGTTT GGAGAAGGGT TTTCTCCTTT GCTTCCAGTT 1500 

CAGACTATCA AGGAGGAAGA AATCCAGCCT GGGGAGGAAA TGCCACACTT AGCGAGACCC 1560

ATCAAAGTGG AGAGCCCTCC CTTGGAAGAG TGGCCCTCCC CGGCCCCATC TTTCAAAGAG 1620

GAATCATCTC ACTCCTGGGA GGATTCGTCC CAATCTCCCA CCCCAAGACC CAAGAAGTCC 1680

TACAGTGGGC TTAGGTCCCC AACCCGGTGT GTCTCGGAAA TGCTTGTGAT TCAACACAGG 1740 

GAGAGGAGGG AGAGGAGCCG GTCTCGGAGG AAACAGCATC TACTGCCTCC CTGTGTGGAT 1800 

GAGCCGGAGC TGCTCTTCTC AGAGGGGCCC AGTACTTCCC GCTGGGCCGC AGAGCTCCCG 1860 

TTCCCAGCAG ACTCCTCTGA CCCTGCCTCC CAGCTCAGCT ACTCCCAGGA AGTGGGAGGA 1920

CCTTTTAAGA CACCCATTAA GGAAACGCTG CCCATCTCCT CCACCCCGAG CAAATCTGTC 1980

CTCCCCAGAA CCCCTGAATC CTGGAGGCTC ACGCCCCCAG CCAAAGTAGG GGGACTGGAT 2040 

TTCAGCCCAG TACAAACCCC CCAGGGTGCC TCTGACCCCT TGCCTGACCC CCTGGGGCTG 2100 

ATGGATCTCA GCACCACTCC CTTGCAAAGT GCTCCCCCCC TTGAATCACC GCAAAGGCTC 2160 

CTCAGTTCAG AACCCTTAGA CCTCATCTCC GTCCCCTTTG GCAACTCTTC TCCCTCAGAT 2220

ATAGACGTCC CCAAGCCAGG CTCCCCGGAG CCACAGGTTT CTGGCCTTGC AGCCAATCGT 2280

TCTCTGACAG AAGGCCTGGT CCTGGACACA ATGAATGACA GCCTCAGCAA GATCCTGCTG 2340

GACATCAGCT TTCCTGGCCT GGACGAGGAC CCACTGGGCC CTGACAACAT CAACTGGTCC 2400

CAGTTTATTC CTGAGCTACA GTAGAGCCCT GCCCTTGCCC CTGTGCTCAA GCTGTCCACC 2460 

ATCCCGGGCA CTCCAAGGCT CAGTGCACCC CAAGCCTCTG AGTGAGGACA GCAGGCAGGG 2520 

ACTGTTCTGC TCCTCATAGC TCCCTGCTGC CTGATTATGC AAAAGTAGCA GTCACACCCT 2580

AGCCACTGCT GGGACCTTGT GTTCCCCAAG AGTATCTGAT TCCTCTGCTG TCCCTGCCAG 2640 

GAGCTGAAGG GTGGGAACAA CAAAGGCAAT GGTGAAAAGA GATTAGGAAC CCCCCAGCCT 2700 

GTTTCCATTC TCTGCCCAGC AGTCTCTTAC CTTCCCTGAT CTTTGCAGGG TGGTCCGTGT 2760

AAATAGTATA AATTCTCCAA ATTATCCTCT AATTATAAAT GTAAGCTTAT TTCCTTAGAT 2820

CATTATCCAG AGACTGCCAG AAGGTGGGTA GGATGACCTG GGGTTTCAAT TGACTTCTGT 2880

TCCTTGCTTT TAGTTTTGAT AGAAGGGAAG ACCTGCAGTG CACGGTTTCT TCCAGGCTGA 2940 

GGTACCTGGA TCTTGGGTTC TTCACTGCAG GGACCCAGAC AAGTGGATCT GCTTGCCAGA 3000 

GTCCTTTTTG CCCCTCCCTG CCACCTCCCC GTGTTTCCAA GTCAGCTTTC CTGCAAGAAG 3060

AAATCCTGGT TAAAAAAGTC TTTTGTATTG GGTCAGGAGT TGAATTTGGG GTGGGAGGAT 3120

GGATGCAACT GAAGCAGAGT GTGGGTGCCC AGATGTGCGC TATTAGATGT TTCTCTGATA 3180

ATGTCCCCAA TCATACCAGG GAGACTGGCA TTGACGAGAA CTCAGGTGGA GGCTTGAGAA 3240 

GGCCGAAAGG GCCCCTGACC TGCCTGGCTT CCTTAGCTTG CCCCTCAGCT TTGCAAAGAG 3300 

CCACCCTAGG CCCCAGCTGA CCGCATGGGT GTGAGCCAGC TTGAGAACAC TAACTACTCA 3360
ATAAAAGCGA AGGTGGAAAA AAAAAAAAAA AAAAAAA

Seq ID NO: 71 Protein sequence:
Protein Accession #: AAH06529.1

1 11 21 31 41 51

| | | | | |

MKTSPRRPLI LKRRRLPLPV QNAPSETSEE EPKRSPAQQE SNQAEASKEV AESNSCKFPA 60

GIKIINHPTM PNTQVVAIPN NANIHSIITA LTAKGKESGS SGPNKFILIS CGGAPTQPPG 120

LRPQTQTSYD AKRTEVTLET LGPKPAARDV NLPRPPGALC EQKRETCADG EAAGCTINNS 180 

LSNIQWLRKM SSDGLGSRSI KQEMEEKENC HLEQRQVKVE EPSRPSASWQ NSVSERPPYS 240 

YMAMIQFAIN STERKRMTLK DIYTWIEDHF PYFKHIAKPG WKNSIRHNLS LHDMFVRETS 300

ANGKVSFWTI HPSANRYLTL DQVFKQQKRP NPELRRNMTI KTELPLGARR KMKPLLPRVS 360

SYLVPIQFPV NQSLVLQPSV KVPLPLAASL MSSELARHSK RVRIAPKVLL AEEGIAPLSS 420 

AGPGKEEKLL FGEGFSPLLP VQTIKEEEIQ PGEEMPHLAR PIKVESPPLE EWPSPAPSFK 480 

EESSHSWEDS SQSPTPRPKK SYSGLRSPTR CVSEMLVIQH RERRERSRSR RKQHLLPPCV 540

DEPELLFSEG PSTSRWAAEL PFPADSSDPA SQLSYSQEVG GPFKTPIKET LPISSTPSKS 600 

VLPRTPESWR LTPPAKVGGL DFSPVQTPQG ASDPLPDPLG LMDLSTTPLQ SAPPLESPQR 660 

LLSSEPLDLI SVPFGNSSPS DIDVPKPGSP EPQVSGLAAN RSLTEGLVLD TMNDSLSKIL 720 
LDISFPGLDE DPLGPDNINW SQFIPELQ 



Seq ID NO: 72 DNA sequence 
Nucleic Acid Accession #: U74612.1 
Coding sequence: 178-2583 

1 11 21 31 41 51 

| | | | | |

GGCACGAGGG GGACCCGGCC GGTCCGGCGC GAGCCCCCGT CCGGGGCCCT GGCTCGGCCC 60 

CCAGGTTGGA GGAGCCCGGA GCCCGCCTTC GGAGCTACGG CCTAACGGCG GCGGCGACTG 120 

CAGTCTGGAG GGTCCACACT TGTGATTCTC AATGGAGAGT GAAAACGCAG ATTCATAATG 180 

AAAACTAGCC CCCGTCGGCC ACTGATTCTC AAAAGACGGA GGCTGCCCCT TCCTGTTCAA 240

AATGCCCCAA GTGAAACATC AGAGGAGGAA CCTAAGAGAT CCCCTGCCCA ACAGGAGTCT 300

AATCAAGCAG AGGCCTCCAA GGAAGTGGCA GAGTCCAACT CTTGCAAGTT TCCAGCTGGG 360

ATCAAGATTA TTAACCACCC CACCATGCCC AACACGCAAG TAGTGGCCAT CCCCAACAAT 420

GCTAATATTC ACAGCATCAT CACAGCACTG ACTGCCAAGG GAAAAGAGAG TGGCAGTAGT 480

GGGCCCAACA AATTCATCCT CATCAGCTGT GGGGGAGCCC CAACTCAGCC TCCAGGACTC 540 

CGGCCTCAAA CCCAAACCAG CTATGATGCC AAAAGGACAG AAGTGACCCT GGAGACCTTG 600 

GGACCAAAAC CTGCAGCTAG GGATGTGAAT CTTCCTAGAC CACCTGGAGC CCTTTGCGAG 660 

CAGAAACGGG AGACCTGTGC AGATGGTGAG GCAGCAGGCT GCACTATCAA CAATAGCCTA 720

TCCAACATCC AGTGGCTTCG AAAGATGAGT TCTGATGGAC TGGGCTCCCG CAGCATCAAG 780

CAAGAGATGG AGGAAAAGGA GAATTGTCAC CTGGAGCAGC GACAGGTTAA GGTTGAGGAG 840

CCTTCGAGAC CATCAGCGTC CTGGCAGAAC TCTGTGTCTG AGCGGCCACC CTACTCTTAC 900

ATGGCCATGA TACAATTCGC CATCAACAGC ACTGAGAGGA AGCGCATGAC TTTGAAAGAC 960

ATCTATACGT GGATTGAGGA CCACTTTCCC TACTTTAAGC ACATTGCCAA GCCAGGCTGG 1020 

AAGAACTCCA TCCGCCACAA CCTTTCCCTG CACGACATGT TTGTCCGGGA GACGTCTGCC 1080 

AATGGCAAGG TCTCCTTCTG GACCATTCAC CCCAGTGCCA ACCGCTACTT GACATTGGAC 1140 

CAGGTGTTTA AGCCACTGGA CCCAGGGTCT CCACAATTGC CCGAGCACTT GGAATCACAG 1200

CAGAAACGAC CGAATCCAGA GCTCCGCCGG AACATGACCA TCAAAACCGA ACTCCCCCTG 1260

GGCGCACGGC GGAAGATGAA GCCACTGCTA CCACGGGTCA GCTCATACCT GGTACCTATC 1320 

CAGTTCCCGG TGAACCAGTC ACTGGTGTTG CAGCCCTCGG TGAAGGTGCC ATTGCCCCTG 1380 

GCGGCTTCCC TCATGAGCTC AGAGCTTGCC CGCCATAGCA AGCGAGTCCG CATTGCCCCC 1440 

AAGGTTTTTG GGGAACAGGT GGTGTTTGGT TACATGAGTA AGTTCTTTAG TGGCGATCTG 1500

CGAGATTTTG GTACACCCAT CACCAGCTTG TTTAATTTTA TCTTTCTTTG TTTATCAGTG 1560 

CTGCTAGCTG AGGAGGGGAT AGCTCCTCTT TCTTCTGCAG GACCAGGGAA AGAGGAGAAA 1620 

CTCCTGTTTG GAGAAGGGTT TTCTCCTTTG CTTCCAGTTC AGACTATCAA GGAGGAAGAA 1680 

ATCCAGCCTG GGGAGGAAAT GCCACACTTA GCGAGACCCA TCAAAGTGGA GAGCCCTCCC 1740

TTGGAAGAGT GGCCCTCCCC GGCCCCATCT TTCAAAGAGG AATCATCTCA CTCCTGGGAG 1800 

GATTCGTCCC AATCTCCCAC CCCAAGACCC AAGAAGTCCT ACAGTGGGCT TAGGTCCCCA 1860 

ACCCGGTGTG TCTCGGAAAT GCTTGTGATT CAACACAGGG AGAGGAGGGA GAGGAGCCGG 1920

TCTCGGAGGA AACAGCATCT ACTGCCTCCC TGTGTGGATG AGCCGGAGCT GCTCTTCTCA 1980 

GAGGGGCCCA GTACTTCCCG CTGGGCCGCA GAGCTCCCGT TCCCAGCAGA CTCCTCTGAC 2040 

CCTGCCTCCC AGCTCAGCTA CTCCCAGGAA GTGGGAGGAC CTTTTAAGAC ACCCATTAAG 2100

GAAACGCTGC CCATCTCCTC CACCCCGAGC AAATCTGTCC TCCCCAGAAC CCCTGAATCC 2160

TGGAGGCTCA CGCCCCCAGC CAAAGTAGGG GGACTGGATT TCAGCCCAGT ACAAACCTCC 2220

CAGGGTGCCT CTGACCCCTT GCCTGACCCC CTGGGGCTGA TGGATCTCAG CACCACTCCC 2280 

TTGCAAAGTG CTCCCCCCCT TGAATCACCG CAAAGGCTCC TCAGTTCAGA ACCCTTAGAC 2340 

CTCATCTCCG TCCCCTTTGG CAACTCTTCT CCCTCAGATA TAGACGTCCC CAAGCCAGGC 2400 

TCCCCGGAGC CACAGGTTTC TGGCCTTGCA GCCAATCGTT CTCTGACAGA AGGCCTGGTC 2460

CTGGACACAA TGAATGACAG CCTCAGCAAG ATCCTGCTGG ACATCAGCTT TCCTGGCCTG 2520 

GACGAGGACC CACTGGGCCC TGACAACATC AACTGGTCCC AGTTTATTCC TGAGCTACAG 2580 

TAGAGCCCTG CCCTTGCCCC TGTGCTCAAG CTGTCCACCA TCCCGGGCAC TCCAAGGCTC 2640 

AGTGCACCCC AAGCCTCTGA GTGAGGACAG CAGGCAGGGA CTGTTCTGCT CCTCATAGCT 2700 

CCCTGCTGCC TGATTATGCA AAAGTAGCAG TCACACCCTA GCCACTGCTG GGACCTTGTG 2760

TTCCCCAAGA GTATCTGATT CCTCTGCTGT CCCTGCCAGG AGCTGAAGGG TGGGAACAAC 2820

AAAGGCAATG GTGAAAAGAG ATTAGGAACC CCCCAGCCTG TTTCCATTCT CTGCCCAGCA 2880 

GTCTCTTACC TTCCCTGATC TTTGCAGGGT GGTCCGTGTA AATAGTATAA ATTCTCCAAA 2940 

TTATCCTCTA ATTATAAATG TAAGCTTATT TCCTTAGATC ATTATCCAGA GACTGCCAGA 3000

AGGTGGGTAG GATGACCTGG GGTTTCAATT GACTTCTGTT CCTTGCTTTT AGTTTTGATA 3060

GAAGGGAAGA CCTGCAGTGC ACGGTTTCTT CCAGGCTGAG GTACCTGGAT CTTGGGTTCT 3120 

TCACTGCAGG GACCCAGACA AGTGGATCTG CTTGCCAGAG TCCTTTTTGC CCCTCCCTGC 3180 

CACCTCCCCG TGTTTCCAAG TCAGCTTTCC TGCAAGAAGA AATCCTGGTT AAAAAAGTCT 3240 

TTTGTATTGG GTCAGGAGTT GAATTTGGGG TGGGAGGATG GATGCAACTG AAGCAGAGTG 3300 

TGGGTGCCCA GATGTGCGCT ATTAGATGTT TCTCTGATAA TGTCCCCAAT CATACCAGGG 3360 

AGACTGGCAT TGACGAGAAC TCAGGTGGAG GCTTGAGAAG GCCGAAAGGG CCCCTGACCT 3420 

GCCTGGCTTC CTTAGCTTGC CCCTCAGCTT TGCAAAGAGC CACCCTAGGC CCCAGCTGAC 3480 

CGCATGGGTG TGAGCCAGCT TGAGAACACT AACTACTCAA TAAAAGCGAA GGTGGACAAA 3540 
AAAAAAAAAA AAAAA 

Seq ID NO: 73 Protein sequence: 
Protein Accession #: AAC51128.1

1 11 21 31 41 51

| | | | | |

MKTSPRRPLI LKRRRLPLPV QNAPSETSEE EPKRSPAQQE SNQAEASKEV AESNSCKFPA
GIKIINHPTM PNTQVVAIPN NANIHSIITA LTAKGKESGS SGPNKFILIS CGGAPTQPPG
LRPQTQTSYD AKRTEVTLET LGPKPAARDV NLPRPPGALC EQKRETCADG EAAGCTINNS
LSNIQWLRKM SSDGLGSRSI KQEMEEKENC HLEQRQVKVE EPSRPSASWQ NSVSERPPYS
YMAMIQFAIN STERKRMTLK DIYTWIEDHF PYFKHIAKPG WKNSIRHNLS LHDMFVRETS
ANGKVSFWTI HPSANRYLTL DQVFKPLDPG SPQLPEHLES QQKRPNPELR RNMTIKTELP
LGARRKMKPL LPRVSSYLVP IQFPVNQSLV LQPSVKVPLP LAASLMSSEL ARHSKRVRIA
PKVFGEQVVF GYMSKFFSGD LRDFGTPITS LFNFIFLCLS VLLAEEGIAP LSSAGPGKEE
KLLFGEGFSP LLPVQTIKEE EIQPGEEMPH LARPIKVESP PLEEWPSPAP SFKEESSHSW
EDSSQSPTPR PKKSYSGLRS PTRCVSEMLV IQHRERRERS RSRRKQHLLP PCVDEPELLF
SEGPSTSRWA AELPFPADSS DPASQLSYSQ EVGGPFKTPI KETLPISSTP SKSVLPRTPE
SWRLTPPAKV GGLDFSPVQT SQGASDPLPD PLGLMDLSTT PLQSAPPLES PQRLLSSEPL
DLISVPFGNS SPSDIDVPKP GSPEPQVSGL AANRSLTEGL VLDTMNDSLS KILLDISFPG
LDEDPLGPDN INWSQFIPEL Q

Seq ID NO: 74 DNA sequence 

Nucleic Acid Accession #: Eos sequence 

Coding sequence: 111-416 

1 11 21 31 41 51 

| | | | | |

GGGAAGAGCC AGGCTGAGCC TTATAAAGGA CTGCTCTTTG TCCAAACACA CACATCTCAC 
TCATCCTTCT ACTCGTGACG CTTCCCAGCT CTGGCTTTTT GAAAGCAAAG ATGAGCAACA 
CTCAAGCTGA GAGGTCCATA ATAGGCATGA TCGACATGTT TCACAAATAC ACCAGACGTG 
ATGACAAGAT TGAGAAGCCA AGCCTGCTGA CGATGATGAA GGAGAACTTG CCCAACTTCC 
TTAGTGCCTG TGACAAAAAG GGCACAAATT ACCTCGCCGA TGTCTTTGAG AAAAAGGACA 
AGAATGAGGA TAAGAAGATT GATTTTTCTG AGTTTCTGTC CTTGCTGGGA GACATAGCCA 
CAGACTACCA CAAGCAGAGC CATGGAGCAG CGCGCTGTTC CGGGGGCAGC CAGTGACCCA 
GCCCCACCAA TGGGCCTCCA GAGACCCCAG GAACAATAAA ATGTCTTCTC CCACCAGA 

Seq ID NO: 75 Protein sequence:
Protein Accession #: Eos sequence

1 11 21 31 41 51

| | | | | |

MSNTQAERSI IGMIDMFHKY TRRDDKIEKP SLLTMMKENF PNFLSACDKK GTNYLADVFE 
KKDKNEDKKI DFSEFLSLLG DIATDYHKQS HGAAPCSGGS Q 



Seq ID NO: 76 DNA sequence 

Nucleic Acid Accession #: Eos sequence 

Coding sequence: 111-416 

1 11 21 31 41 51 

| | | | | |

GGGAAGAGCC AGGCTGAGCC TTATAAAGGA CTGCTCTTTG TCCAAACACA CACATCTCAC 
TCATCCTTCT ACTCGTGACA CTTCCCAGTT CTGGCTTTTT GAAAGCAAAG ATGAGCAACA 
CTCAAGCTGA GAGGTCCATA ATAGGCATGA TCGACATGTT TCACAAATAC ACCGGACGTG 
ATGGCAAGAT TGAGAAGCCA AGCCTGCTGA CGATGATGAA GGAGAACTTC CCCAATTTCC 
TCAGTGCCTG TGACAAAAAG GGCATACATT ACCTCGCCAC TGTCTTTGAG AAAAAGGACA
AGAATGAGGA TAAGAAGATT GATTTTTCTG AGTTTCTGTC CTTGCTGGGA GACATAGCCG 
CAGACTACCA CAAGCAGAGC CATGGAGCGG CGCCCTGTTC TGGGGGAAGC CAGTGATCCA 
GCCCCACCAA GGGGCCTCCA GAGACCCCAG GAACAATAAG TGTCTCCTCC CACCAGA 

Seq ID NO: 77 Protein sequence: 
Protein Accession #: XP_048124.1

1 11 21 31 41 51

| | | | | |

MSNTQAERSI IGMIDMFHKY TGRDGKIEKP SLLTMMKENF PNFLSACDKK GIHYLATVFE 
KKDKNEDKKI DFSEFLSLLG DIAADYHKQS HGAAPCSGGS Q 



Seq ID NO: 78 DNA sequence 
Nucleic Acid Accession #: Z73678 
Coding sequence: 253-2433 

1 11 21 31 41 51

| | | | | |

GGGGTGGTGC AGGGCAGGGG TGGTATATCC TGTCTGACGG AGGGCGGGCC TCGCCAGTGC 60

CAGAGAGGGA CGAACCAGGG TGGAAGCGCC AGGAGCAGCT GCAGGGAGCC CTCACGCGGA 120

CCTCGCACTC TATGGCCGTA GGGAGCCGCT GAGAGCGAGA AGAGCACGCT CCTGCCCGCC 180

CGCTGCACCG CACCTCGCCT CGCCTCTCTG CTCTCCTAGG CCCCGGCCGC GCGCCACCCG 240

CCTCCCGCCA CCATGAACCA CTCGCCGCTC AAGACCGCCT TGGCGTACGA ATGCTTCCAG 300

GACCAGGACA ACTCCACGTT GGCTTTGCCG TCGGACCAAA AGATGAAAAC AGGCACGTCT 360

GGCAGGCAGC GCGTGCAGGA GCAGGTGATG ATGACCGTCA AGCGGCAGAA GTCCAAGTCT 420

TCCCAGTCGT CCACCCTGAG CCACTCCAAT CGAGGTTCCA TGTATGATGG CTTGGCTGAC 480

AATTACAACT ATGGGACCAC CAGCAGGAGC AGCTACTACT CCAAGTTCCA GGCAGGGAAT 540

GGCTCATGGG GATATCCGAT CTACAATGGA ACCCTCAAGC GGGAGCCTGA CAACAGGCGC 600

TTCAGCTCCT ACAGCCAGAT GGAGAACTGG AGCCGGCACT ACCCCCGGGG CAGCTGTAAC 660

ACCACCGGCG CAGGCAGCGA CATCTGCTTC ATGCAGAAAA TCAAGGCGAG CCGCAGTGAG 720

CCCGACCTCT ACTGTGACCC ACGGGGCACC CTGCGCAAGG GCACGCTGGG CAGCAAGGGC 780

CAGAAGACCA CCCAGAACCG CTACAGCTTT TACAGCACCT GCAGTGGTCA GAAGGCCATA 840

AAGAAGTGCC CTGTGCGCCC GCCCTCTTGT GCCTCCAAGC AGGACCCTGT GTATATCCCG 900

CCCATCTCCT GCAACAAGGA CCTGTCCTTT GGCCACTCTA GGGCCAGCTC CAAGATCTGC 960

AGTGAGGACA TCGAGTGCAG TGGGCTGACC ATCCCCAAGG CTGTGCAGTA CCTGAGCTCC 1020

CAGGATGAGA AGTACCAGGC CATTGGGGCC TATTACATCC AGCATACCTG CTTCCAGGAT 1080 

GAATCTGCCA AGCAACAGGT CTATCAGCTG GGAGGCATCT GCAAGCTGGT GGACCTCCTC 1140

CGCAGCCCCA ACCAGAACGT CCAGCAGGCC GCGGCAGGGG CCCTGCGCAA CCTGGTGTTC 1200

AGGAGCACCA CCAACAAGCT GGAGACCCGG AGGCAGAATG GGATCCGCGA GGCAGTCAGC 1260

CTCCTGAGGA GAACCGGGAA CGCCGAGATC CAGAAGCAGC TGACTGGGCT GCTCTGGAAC 1320

CTGTCTTCCA CTGACGAGCT GAAGGAGGAA CTCATTGCCG ACGCCCTGCC TGTTCTGGCC 1380 

GACCGCGTCA TCATTCCCTT CTCTGGCTGG TGCGATGGCA ATAGCAACAT GTCCCGGGAA 1440 

GTGGTGGACC CTGAGGTCTT CTTCAATGCC ACAGGCTGCT TGAGGAACCT GAGCTCGGCC 1500 

GATGCAGGCC GCCAGACCAT GCGTAACTAC TCAGGGCTCA TTGATTCCCT CATGGCCTAT 1560 

GTCCAGAACT GTGTAGCGGC CAGCCGCTGT GACGACAAGT CTGTGGAAAA CTGCATGTGT 1620

GTTCTGCACA ACCTCTCCTA CCGCCTGGAC GCCGAGGTGC CCACCCGCTA CCGCCAGCTG 1680 

GAGTATAACG CCCGCAACGC CTACACCGAG AAGTCCTCCA CTGGCTGCTT CAGCAACAAG 1740 

AGCGACAAGA TGATGAACAA CAACTATGAC TGCCCCCTGC CTGAGGAAGA GACCAACCCC 1800 

AAGGGCAGCG GCTGGTTGTA CCATTCAGAT GCCATCCGCA CCTACCTGAA CCTCATGGGC 1860

AAGAGCAAGA AAGATGCTAC CCTGGAGGCC TGTGCTGGTG CCCTGCAGAA CCTGACAGCC 1920 

AGCAAGGGGC TGATGTCCAG TGGCATGAGC CAGTTGATTG GGCTGAAGGA AAAGGGCCTG 1980 

CCACAAATTG CCCGCCTCCT GCAATCTGGC AACTCTGATG TGGTGCGGTC CGGAGCCTCC 2040 

CTCCTGAGCA ACATGTCCCG CCACCCTCTG CTGCACAGAG TGATGGGGAA CCAGGTGTTC 2100 

CCGGAGGTGA CCAGGCTCCT CACCAGCCAC ACTGGCAATA CCAGCAACTC CGAAGACATC 2160 

TTGTCCTCGG CCTGCTACAC TGTGAGGAAC CTGATGGCCT CGCAGCCACA ACTGGCCAAG 2220 

CAGTACTTCT CCAGCAGCAT GCTCAACAAC ATCATCAACC TGTGCCGAAG CAGTGCCTCA 2280

CCCAAGGCCG CAGAAGCTGC CCGGCTTCTC CTGTCTGACA TGTGGTCCAG CAAGGAACTG 2340

CAGGGTGTCC TCAGACAGCA AGGTTTCGAT AGGAACATGC TGGGAACCTT AGCTGGGGCC 2400

AACAGCCTCA GGAACTTCAC CTCCCGATTC TAAGAAGAGA CTGTCCAAGC AAGTTAGGCT 2460 

TGCAGGAAGA TATGACCCAG CTGAGAAGCC CTCAGGCCTC GCTGGATGGG GTTTTCTGTC 2520

CATCCTGTGC AGTATTTGGG AAAGTTCACA AGAAACTGAG AAGAAACCTA AAAACTGTGG 2580 

ATAGTGGAAA GATTTTTAGA TTTTTTTTTT CCTTGGGGAA ACTGGCAGGC AATGGGGGTT 2640

AGGGAGGTTG GGGCGGGGGG GGCTTTCTTG AGTTAAAGGG GCTTATATGT GATGTCAATA 2700 

TTTCTTCCTC TGAGAAATGG TATATATATG TGTCTAATGT AAGTGTGTGC ATGCATGTGC 2760 

GCGTGCATGT GTGTGTGTGT GAGTGTCTTA AAGCATAACC ACAAACTGCA AAAAGCTAGG 2820

TAAGCTATTT TGTTGCAGCT CATAAGGTGG TGAAAAGGAC TCTCCTGTGT TTCTTACTCA 2880

TAGGCAAGGA CAACATGTGC TTTTTGGTGA GCTGCTCATA ATTCCTGAAA TGTGTGGTGC 2940

CAGGGCAAGG GGGCCATCAC TGCAGTCAGG CCCTCAGAGG AGTCCTGCAG GCTTCCTACC 3000

AGTGGTCTCC AAGGGTGCAG GAGTAACTGG GGCTGGGCCA GCCTCCCCCC TTACAAGGCT 3060

GCTTTCCACG AAGGGAGGTC TGGTGTATCT CATGGGAGAA TCTGGGGTGT CTGTAGTGTC 3120

ACCCCTCCAG CAGCGCCACA AGGACTGAGG TTGGGTAGGT GTGAGGTTCC AGAGGACAGC 3180 

AGGACACTCT CGCATACTTT GCCAAATGAG GCCTGCTCAG AGGAGTAGGA GCTGAAAGAT 3240 

GGTGCCTTCC ACCCTCTTGG GCTGTGTGCC CATCAGAGCA GGCTCAGCCT GCAAAGGCCC 3300

TGCATTCAGA GGTCTTGTAA TCTACTTGTT GCAGGAGAAA GAAGGTAAAA AATGATTTTT 3360

TTAAGAAAAG CTATTTTATT GCAGCTCTTT CCCAAGAGCT GTTCTGGGAA TGGCTGGTCT 3420

TCATATTCCC AGTGGAGAGG GGAACAAGTG GGGCTGGGCA TATACCTATT CCGGCTTCTA 3480 

GTGGGATGGA GTTGGGGTAT AGAAATTAAC CAGGAAGATG TTTCCACCAA GCCTGCTGTG 3540 

AGTCAATTGA GGGAGTGTTT GGGTCCCAGG AGACTTGGAC GGGGGGAGTT TGGGTAGACT 3600

AGGAAAGGAA AGTGCCATAT CAGGGTACCG GTACCGGCAA GCTCACATCT CAGCCAGGGG 3660

CCATGCCCCA CTTCCCCTGA CCCCAGCTGT CTTGTCTCCA CTCTGTGAAA CCCACAGGGG 3720

ATGTGATAAA CAGGGCTATT AGGGGTATCA GCCACGTCGA GCCCCCAGAC TCTGTGCACT 3780

TCAGACCAGC AGCAGCAGGA GGGCTCCCGA GGGCCTTATG AGAAAACCTG TGTGGACATC 3840

CCTTGGTGTA CACTAAGACA GAGCAGAGCC CAGCGCTCCC AAGCCTTCCT CCTTCCAGCT 3900

TCTACCTCCA TGCTAGCATT GCTGGTGTTA GAGAGGAATT AACTTCCTGG TCTGTGCCCT 3960

TCTCTAGAAG AATATAAGAT GCTCCTCCTC CTCACCCCTT CTCAGCCTCC TCCCAAGTCT 4020 

TCCTCTTCTG CACCACCCCC GAGTCCAAAC CCACCTCTTG CCCCAGCATT CAGGCTGGAA 4080

AACACTGATG TGGACTCAGT ATGACAACTG AGATGGGGGA AGCCAGACAT GTGAGGACGC 4140

TGTCCTCCGA GAGGTGTCCC CGGCTGTTAG CCAGCTGTGC TGTGGTGCTG TGGGTCTGTC 4200

ATACCCTCCC TTGCTTCTGT TCACACTGGG AGGCCCACTC CTGGCTCACC TCTCCCTCTC 4260

AGGGACCCAC GTGGGAGCCT GGATCCCTGG ACTGTCCTGG GCATAGGTTT CAGGGGCCTC 4320

CTTTGTTGTC ATCAGAACCC AGAGGAATTC TTCTCCTAAA AAATACGTAT GGCATACCAA 4380

TCTGTGCGGG GCAGTGTCCT AAGCACTTAG ACTACATCAG GGAAGAACAC AGACCACATC 4440

CCCGTCCTCA TGCGGCTTAT GTTTTCTGGA GGAAAGTGGA GACACAAGTC CTTGGCTTTA 4500

GGGCTCCCCC GGCTGGGGGC TGTGCAGTCC GGTCAGGGCG GGAGGGGAAA TGCACCGCTG 4560 

CATGTGAACC TTACCAGCCC AGGCGGATGC CCCTTCCCCT TAGCACTACC CTGGCCTCCT 4620 

GCATCCCCTC GCCTCATGTT CCTCCCACCT TCAAAGAATG AAGAGCCCCA TGGGCCCAGC 4680 

CCCTGCCCTG GGAACCAGGC AGCCTTCCAG ACCTCAGGGG CTGAGGCAGA CTATTAGGGC 4740 

AGGGCTGACT TTGGTGACAC TGCCCATTCC CTCTCAGGCC AGCTCAGGTC ACCCGGGCCT 4800 

CTGACCCAGG CCTGTCACTT TGAGAGGGGC AAAACTGAGA GGGGCTTTTC CTAGAGAAAG 4860 

AGAACAAGGA GCTTGCCAGG CTTCATGTAG CCGACACACG TCTCAGGATT TTAAGTCCAC 4920

ATTGGCCTCA CACTAGCCTA GGCCAATGCC CAAAATAAGG AGTTCCAATT TGGGGCCAAA 4980 

TGAGGAAGGA CACAGACTCT GCCCTGGGAT CTCCTGTGCT AGCGGCCAAT GACAAATCCA 5040 

GTCATTGGCC ACCAGCCACC TCTGCAGTGG GGACCACACT AGCAGCCCTG ACTCCACACT 5100 

CCTCCTGGGG ACCCAAGAGG CAGTGTTGCT GTCTGCGTGT CCACCTTGGA ATCTGGCTGA 5160 

ACTGGCTGGG AGGACCAAGA CTGCGGCTGG GGTGGGCAGG GAAGGGAAGC CGGGGGCTGC 5220 

TGTGAGGGAT CTTGGAGCTT CCCTGTAGCC CACCTTCCCC TTGCTTCATG TTTGTAGAGG 5280 

AACCTTGTGC CGGCCAGGCC CAGTTTCCTT GTGTGATACA CTAATGTATT TGCTTTTTTT 5340 
GGAAATAGAG AAAATCAATA AATTGCTAGT GTTTCTTTGA AAAAAAAAA 

Seq ID NO: 79 Protein sequence: 
Protein Accession #: CAA98022.1 

1 11 21 31 41 51 

| | | | | |

MNHSPLKTAL AYECFQDQDN STLALPSDQK MKTGTSGRQR VQEQVMMTVK RQKSKSSQSS 60

TLSHSNRGSM YDGLADNYNY GTTSRSSYYS KFQAGNGSWG YPIYNGTLKR EPDNRRFSSY 120

SQMENWSRHY PRGSCNTTGA GSDICFMQKI KASRSEPDLY CDPRGTLRKG TLGSKGQKTT 180

QNRYSFYSTC SGQKAIKKCP VRPPSCASKQ DPVYIPPISC NKDLSFGHSR ASSKICSEDI 240

ECSGLTIPKA VQYLSSQDEK YQAIGAYYIQ HTCFQDESAK QQVYQLGGIC KLVDLLRSPN 300

QNVQQAAAGA LRNLVFRSTT NKLETRRQNG IREAVSLLRR TGNAEIQKQL TGLLWNLSST 360

DELKEELIAD ALPVLADRVI IPFSGWCDGN SNMSREVVDP EVFFNATGCL RNLSSADAGR 420

QTMHNYSGLI DSLMAYVQNC VAASRCDDKS VENCMCVLHN LSYRLDAEVP TRYRQLEYNA 480

RNAYTEKSST GCFSNKSDKM MNNNYDCPLP EEETNPKGSG WLYHSDAIRT YLNLMGKSKK 540

DATLEACAGA LQNLTASKGL MSSGMSQLIG LKEKGLPQIA RLLQSGNSDV VRSGASLLSN 600

MSRHPLLHRV MGNQVFPEVT RLLTSHTGNT SNSEDILSSA CYTVRNLMAS QPQLAKQYFS 660

SSMLNNIINL CRSSASPKAA EAARLLLSDM WSSKELQGVL RQQGFDRNML GTLAGANSLR 720
NFTSRF






Seq ID NO: 80 DNA sequence
Nucleic Acid Accession #: NM_006516.1
Coding sequence: 180-1658

1 11 21 31 41 51

| | | | | |

TAGTCGCGGG TCCCCGAGTG AGCACGCCAG GGAGCAGGAG ACCAAACGAC GGGGGTCGGA 60
GTCAGAGTCG CAGTGGGAGT CCCCGGACCG GAGCACGAGC CTGAGCGGGA GAGCGCCGCT 120
CGCACGCCCG TCGCCACCCG CGTACCCGGC GCAGCCAGAG CCACCAGCGC AGCGCTGCCA 180
TGGAGCCCAG CAGCAAGAAG CTGACGGGTC GCCTCATGCT GGCTGTGGGA GGAGCAGTGC 240
TTGGCTCCCT GCAGTTTGGC TACAACACTG GAGTCATCAA TGCCCCCCAG AAGGTGATCG 300
AGGAGTTCTA CAACCAGACA TGGGTCCACC GCTATGGGGA GAGCATCCTG CCCACCACGC 360
TCACCACGCT CTGGTCCCTC TCAGTGGCCA TCTTTTCTGT TGGGGGCATG ATTGGCTCCT 420
TCTCTGTGGG CCTTTTCGTT AACCGCTTTG GCCGGCGGAA TTCAATGCTG ATGATGAACC 480
TGCTGGCCTT CGTGTCCGCC GTGCTCATGG GCTTCTCGAA ACTGGGCAAG TCCTTTGAGA 540
TGCTGATCCT GGGCCGCTTC ATCATCGGTG TGTACTGCGG CCTGACCACA GGCTTCGTGC 600
CCATGTATGT GGGTGAAGTG TCACCCACAG CCTTTCGTGG GGCCCTGGGC ACCCTGCACC 660
AGCTGGGCAT CGTCGTCGGC ATCCTCATCG CCCAGGTGTT CGGCCTGGAC TCCATCATGG 720
GCAACAAGGA CCTGTGGCCC CTGCTGCTGA GCATCATCTT CATCCCGGCC CTGCTGCAGT 780
GCATCGTGCT GCCCTTCTGC CCCGAGAGTC CCCGCTTCCT GCTCATCAAC CGCAACGAGG 840
AGAACCGGGC CAAGAGTGTG CTAAAGAAGC TGCGCGGGAC AGCTGACGTG ACCCATGACC 900
TGCAGGAGAT GAAGGAAGAG AGTCGGCAGA TGATGCGGGA GAAGAAGGTC ACCATCCTGG 960
AGCTGTTCCG CTCCCCCGCC TACCGCCAGC CCATCCTCAT CGCTGTGGTG CTGCAGCTGT 1020
CCCAGCAGCT GTCTGGCATC AACGCTGTCT TCTATTACTC CACGAGCATC TTCGAGAAGG 1080
CGGGGGTGCA GCAGCCTGTG TATGCCACCA TTGGCTCCGG TATCGTCAAC ACGGCCTTCA 1140
CTGTCGTGTC GCTGTTTGTG GTGGAGCGAG CAGGCCGGCG GACCCTGCAC CTCATAGGCC 1200
TCGCTGGCAT GGCGGGTTGT GCCATACTCA TGACCATCGC GCTAGCACTG CTGGAGCAGC 1260
TACCCTGGAT GTCCTATCTG AGCATCGTGG CCATCTTTGG CTTTGTGGCC TTCTTTGAAG 1320
TGGGTCCTGG CCCCATCCCA TGGTTCATCG TGGCTGAACT CTTCAGCCAG GGTCCACGTC 1380
CAGCTGCCAT TGCCGTTGCA GGCTTCTCCA ACTGGACCTC AAATTTCATT GTGGGCATGT 1440
GCTTCCAGTA TGTGGAGCAA CTGTGTGGTC CCTACGTCTT CATCATCTTC ACTGTGCTCC 1500
TGGTTCTGTT CTTCATCTTC ACCTACTTCA AAGTTCCTGA GACTAAAGGC CGGACCTTCG 1560
ATGAGATCGC TTCCGGCTTC CGGCAGGGGG GAGCCAGCCA AAGTGATAAG ACACCCGAGG 1620
AGCTGTTCCA TCCCCTGGGG GCTGATTCCC AAGTGTGAGT CGCCCCAGAT CACCAGCCCG 1680
GCCTGCTCCC AGCAGCCCTA AGGATCTCTC AGGAGCACAG GCAGCTGGAT GAGACTTCCA 1740
AACCTGACAG ATGTCAGCCG AGCCGGGCCT GGGGCTCCTT TCTCCAGCCA GCAATGATGT 1800
CCAGAAGAAT ATTCAGGACT TAACGGCTCC AGGATTTTAA CAAAAGCAAG ACTGTTGCTC 1860
AAATCTATTC AGACAAGCAA CAGGTTTTAT AATTTTTTTA TTACTGATTT TGTTATTTTT 1920
ATATCAGCCT GAGTCTCCTG TGCCCACATC CCAGGCTTCA CCCTGAATGG TTCCATGCCT 1980
GAGGGTGGAG ACTAAGCCCT GTCGAGACAC TTGCCTTCTT CACCCAGCTA ATCTGTAGGG 2040
CTGGACCTAT GTCCTAAGGA CACACTAATC GAACTATGAA CTACAAAGCT TCTATCCCAG 2100
GAGGTGGCTA TGGCCACCCG TTCTGCTGGC CTGGATCTCC CCACTCTAGG GGTCAGGCTC 2160
CATTAGGATT TGCCCCTTCC CATCTCTTCC TACCCAACCA CTCAAATTAA TCTTTCTTTA 2220
CCTGAGACCA GTTGGGAGCA CTGGAGTGCA GGGAGGAGAG GGGAAGGGCC AGTCTGGGCT 2280
GCCGGGTTCT AGTCTCCTTT GCACTGAGGG CCACACTATT ACCATGAGAA GAGGGCCTGT 2340
GGGAGCCTGC AAACTCACTG CTCAAGAAGA CATGGAGACT CCTGCCCTGT TGTGTATAGA 2400
TGCAAGATAT TTATATATAT TTTTGGTTGT CAATATTAAA TACAGACACT AAGTTATAGT 2460
ATATCTGGAC AAGCCAACTT GTAAATACAC CACCTCACTC CTGTTACTTA CCTAAACAGA 2520
TATAAATGGC TGGTTTTTAG AAACATGGTT TTGAAATGCT TGTGGATTGA GGGTAGGAGG 2580
TTTGGATGGG AGTGAGACAG AAGTAAGTGG GGTTGCAACC ACTGCAACGG CTTAGACTTC 2640
GACTCAGGAT CCAGTCCCTT ACACGTACCT CTCATCAGTG TCCTCTTGCT CAAAAATCTG 2700
TTTGATCCCT GTTACCCAGA GAATATATAC ATTCTTTATC TTGACATTCA AGGCATTTCT 2760
ATCACATATT TGATAGTTGG TGTTCAAAAA AACACTAGTT TTGTGCCAGC CGTGATGCTC 2820
AGGCTTGAAA TCGCATTATT TTGAATGTGA AGGGAA

Seq ID NO: 81 Protein sequence:
Protein Accession #: NP_006507.1

1 11 21 31 41 51

| | | | | |

MEPSSKKLTG RLMLAVGGAV LGSLQFGYNT GVINAPQKVI EEFYNQTWVH RYGESILPTT 60
LTTLWSLSVA IFSVGGMIGS FSVGLFVNRF GRRNSMLMMN LLAFVSAVLM GFSKLGKSFE 120
MLILGRFIIG VYCGLTTGFV PMYVGEVSPT AFRGALGTLH QLGIWGILI AQVFGLDSIM 180
GNKDLWPLLL SIIFIPALLQ CIVLPFCPES PRFLLINRNE ENRAKSVLKK LRGTADVTHD 240
LQEMKEESRQ MMREKKVTIL ELFRSPAYRQ PILIAWLQL SQQLSGINAV FYYSTSIFEK 300
AGVQQPVYAT IGSGIVNTAP TWSLFWER AGRRTLHLIG LAGMAGCAIL MTIALALLEQ 360
LPWMSYLSIV AIFGFVAFFE VGPGPIPWFI VAELFSQGPR PAAIAVAGFS NWTSNFIVGM 420
CFQYVEQLCG PYVFIIFTVL LVLFFIFTYF KVPETKGRTF DEIASGFRQG GASQSDKTPE 480
ELPHPLGADS QV

Seq ID NO: 82 DNA sequence
Nucleic Acid Accession #: BC001291
Coding sequence: 44-541



1 11 21 31 41 51

| | | | | |

GGGGGCGCCG CGCGCTGACC CTCCCTGGGC ACCGCTGGGG ACGATGGCGC TGCTCGCCTT 60
GCTGCTGGTC GTGGCCCTAC CGCGGGTGTG GACAGACGCC AACCTGACTG CGAGACAACG 120



WO 02/086443 

AGATCCAGAG GACTCCCAGC GAACGGACGA 
TGAGAGAGAA AACACTTTCG AGTGCCAGAA 
CTGCGTTATA GCGGCCGTGA AAATATTTCC 
CGCTGGTTGT GCAGCGATGG AGAGACCCAA 
GCCCATGCCC TTCTTTTACC TCAAGTGTTG

ACCTATCAAC TCATCAGTGT TCAAAGAATA 
GCTGTGGCTG GCCATCCTCC TGCTGCTGGC 
AGCCACGGGA CTGCCACAGA CTGAGCCTTC 
ACCTGTTGCA TTAAACTTGT TTTCTGTTGA 

GGGATGGGAG AGTGGGGATC AGGTGCAGTT

ACATTCAGAG GAAGTCCAGA TCTCCTGAGT 
AAATCAAACC TTGTAACTCA TTTATTGCTG 
CCTCTGAGGG CTTCAGTATT GATGGGGAGG 
TGCTGAGATG CTTCCGACCT TTCAGGTGAC 

GGGTGAAGAC ATCCCTGGAG TGAAGGACTC

AGGGCTGCCC CCATTCCAGT GGTGGAGGCG 
CTACCAGATT CCAGGAGGCA GAAGATAACT 
ACCAGCTGGC ACAGGTGCAC AGATTCATAA 
ACTTAGGCCA AGTAGAGAGC ATCAGGGTAA 

CATCCATGGG GAGCTGAGAA ATCAGACTCA

TTCAAAAGTT CACGAAAAAA AAAAAAAAAA 
GGGTGACAAT
CCCAAGGAGG 



GCCAGAGGAG 
TAAAATTCGC 
TGCTGGGAGC 
CTCCATTGCA 
CGGAGCATGG 
TTACCTCTTG 
GGCTCTTAAC 
AGTGATTTTG 
ATGGCCACTC 
GAGGCCTAAG 
GCAGGAACAC 
CTCAGCATGG 
CTGTGGATGG 
AATTGTGTTG 
ATTCCCACAC 
ATGGCGTTCA 
AAGTTCCACC 
AAAAAAAAAA 



AGAGTGTGGT 
TGCAAATGGA 
ATGGTTGCGA 
AAGCGGTTTC 
TACTGCAATT 
ATGGGTGAGA 
GCCGGCCTCA 
ACTCGCTCCA 
GTTTGACTTC 
CCTGAAGGGT
GTGACAAGTT 
TTTTCCTTGA 
TACCACTCAT 
TGGGGGAGTC 
GGGGCAGTGG 
CTGCTTTTCC 
AAGAAACTTA 
GTGTGTGTTC 
TTTCTCTGTT 
AAAAACAAAT 
AAAAAAAAAA 



GTCATGTTTG 
CAGAGCCATA 
AGCAGTGCTC 
TCCTGGAAGA 
TAGAGGGGCC 
GCTGTGGTGG 
GCCTGTCTTG 
GACCGTTGTC 
CCAGGGTCTT 
TCTTTAACTC 
TTTCTCTTTG 
CTCCCCTCTG 
GGAGAGTATG 
TGAATGATTG 
GGCACACGTT 
TCAACCTTTC 
GACTTCACCC 
AACATCTGAA
AAGATGCAGC 
ACAAGGGGAC 
AAA 



180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 






Seq ID NO: 83 Protein sequence: 
Protein Accession #: AAH01291 



1 11 21 31 41 51

| | | | | |

MALLALLLW ALPRVWTDAN LTARQRDPED SQRTDEGDNR VWCHVCEREN TFECQNPRRC 60
KWTEPYCVIA AVKIPPRPFM VAKQCSAGCA AMERPKPEEK RFLLEEPMPF FYLKCCKIRY 120
CNLEGPPINS SVFKEYAGSM GESCGGLWLA ILLLLASIAA GLSLS



Seq ID NO: 84 DNA sequence 

Nucleic Acid Accession #: NM_022893.1

Coding sequence: 229-2726 



1 

TGCGCCATCT 
TTTTCTCTGG 

CGCCCGCCGC 
AAGCAAGGCA 
ATTCTTACAG 
CTCCTCACCT 
GAGCACAAAC 
CCTTCCCCTT 
GTCACGCCAG
GAACACATAG 
GGAGCTCTAA 
GATGAGCCCA 
CTCTTGCAAC 
AGTCCCCTGA 
CCACCTCTCC 
GGATCAGTAT 
CTGTTTAGTC 
GAAGAAATGG 
CCAATGGCTA 
AACACGTCTA 
CCATTCCAGC 
TCCGGCCCTC 
ACGTTCAAAT 
TACAAGTGCA 
AAGACGCACA 
GCCAGCTCCC 
TCCGTGGTGG 
GAGGAGGAAG 
CTGACGGAGA 
CACGAGAACA 
GACGTCATGC 
GTCCTGGGCG 
TGCGACGAAG 
CGCGGCTGCT 
AGCCCCAGCT 
CCCCCGGCCA 
GCCTCCAGGC 
GCCTCCTCGT 
GAGCTGGACG 
ATTAGTGGTC 
GAGTACTGTG 
ACGGGCGAAA 
CTCACCAGGC 
TGTAAGATGC 
GATCGAGTGT 
CTCCCACCTG 
CCTGTAGGAT 
ACGAAGCTAA 
TTCTTTTTTC 



11 

1 

TTTTTTGCTT 
TTGTATTATT 
AGTCTCCTTC 
CGCCGCCGCC 
AACCCCAGCA 
ATGATGAACC 
GTGGGCAGTG 
GGAAACAATG 
CACCAATCGA 
AAGATGACGA 
CAGATAAACT 
TCCCCACGCC 
GCAGCTACAC 
ACGCACAGAA 
CCCCGCGGGT 
ATGGGATTCA 
CGAGAGAGGC 
CACCACCGAG 
CCCTGGCCAC 
TGGAGCCTCC 
GCCCACCGCT 
CAGGTAGCAA 
CTCCCTCCCA 
TTCAGAGCAA 
ACCTGTGCGA 
TGCACAAATC 
CGGAACCCGG 
CCAAGTTCAA 
AGGAGGACGA 
GCGAGAGGGT 
GCTCGCGGGG
AGGGCATGGT 
AGAAGCATAA 
ACTCGGTGGC 
CCCCGGGCGA 
CGCTGAGCCC 
CGATGCCCAA 
AGCTCAAAGA 
CGGAGCACTC 
GAGGGATCTC 
CGGGCACGGG 
GGAAAGTCTT 
GGCCTTATAA 
ACATGAAAAC 
CTTTTAGCGT 
TGAATAATGA 
ACACCCCCTT 
TTTTTTCTAG 
GAATATGAGA 
TTTTTCCTTT 



21 
I 

AAAAAAAAGC 
TCTAATTTAT 
TTTCTAACCC 
GCCGCCGCCG 
CTTAAGCAAA 
AGACCACGGC 
CCAGATGAAC 
CAATGGCAGC 
GATGAAAAAA 
TTGTTTATCA 
TCTGCACTGG 
TGGGATGAGT 
ATGTACAACT 
CACTCATGGA 
TGGTATCCCT 
TATTGCAGAC 
TTCCGGCCTG 
ACATCACTTG 
CCATCACCCG 
CGCCATGGAT 
GTCCCCAGGC 
GCCGCCCTTC 
GCCCCCGGTC 
CCTGGTGGTG 
CCACGCGTGC 
GTCCCCCATG 
CACCAGCGAC 
GAGCGAGAAC 
CGAGGAAGAG 
GGACTACGGC 
CGCGGTCGTG 
GCTCAGCTCC 
GCGCGGCCAC 
CGGCGAGTCG 
GTCGGCCTCG 
CTTCTCTAAG 
CACGGAGAAC 
TCCCTTCCTT 
CTCGGAGAAC 
GGGGCGCAGC 
CAGGCCCAGC 
CAAGAACTGT 
ATGCGAGCTG 
GCATGGCCAG 
GTACAGTACC 
TATAAAAACT 
TTTCACCAGT 
TCCCATGTGA 
GTGCTTGTCA 



31 
1 

CATGACGGCT 
TTTGGATGTC 
GGCTCTCCCG 
CCCGCCCCGC 
CGGGAATTCT 
CCGTTGGGAG 
TTCCCATTGG 
CTCTGCTTAG 
GCATCCAATC 
ACGTCATCTA 
AGGGGCCTCT 
GCAGAATATG 
TGCAAACAGC 
TTAAGAATCT 
TCAGGACTAG 
AATAACCCCT 
GCAGAAGGGC 
GACCCCCACC 
AGTGCCTTTG 
TTCTCTAGGA 
CGGCCCAGCC 
CTGGCGACGC 
AAGTCCAAGT 
CACCGGCGCA 
ACCCAGGCCA 
ACGGTCAAGT 
TTGGTGGGCA 
GACCCCAACC 
GAAGAAGAGG 
TTCGGGCTGA 
GGCGTGGGCG 
ATGCAGCACT 
CTGGCCGAGG 
GACCGCATAG 
GGGGGCCTGT 
CGCATCAAGC 
GTGTACTCGC 
AGCTTCGGAG 
GGGAGCTTGC 
GGCACGGGAA 
TCAAAAGAGG 
AGCAATCTCA 
TGCAACTATG 
GTGGGGAAGG 
CTGGAGAAAC 
GAATAGAGGT 
CCCTTTCCCC 
TTTAAACAAA 
CCAGCACACC 
TCCTTTATGT 



41 
I 

CTCCCACAAT 
AAAAGGCACT 
ATGTGAACCG 
AGCCCACCAT 
CGCCCGAGCC 
CTCCAGAAGG 
GGGACATTCT 
AAAAAGCTGT 
CCGTGGAGGT 
GAAGAATTTG 
CCTCCCCTCG 
CCCCGCAGGG 
CATTCACCAG 
ACTTAGAAAG 
GTGCAGAATG 
TTAACCTGCT 
GCTTTCCACC 
GCATAGAGCG 
ACAGGGTGCT 
GACTTAGAGA 
CTATGCAAAG 
CCCCCCTCCC 
CATGCGAGTT 
GCCACACGGG 
GCAAGCTGAA 
CCGACGACGG 
GCGCCAGCAG 
TGATCCCGGA 
AGGAAGAGGA 
GCCTGGAGGC 
ACGAGAGCCG 
TCAGCGAGGC 
CCGAGGGCCA 
ACGATGGCAC 
CCAAAAAGCT 
TCGAGAAGGA 
AGTGGCTCGC 
ACTCCAGACA 
GCTTCTCCAC 
GTGGAGGGAG 
GCAGACGCAG 
CTGTCCACAG 
CCTGTGCCCA 
ACGTTTACAA 
ACATGAAAAA 
ATATTAATAC 
ATCGCCCTCC 
CAAACAAACA 
TGTTTTTTTT 
TCTCACCGTT 



51 
I 

TCATCTTCCC 
GATGAAGATA 
AGCCGTCGTC 
GTCTCGCCGC 
TCTTGAAGCC 
GGATCATGAC
TATTTTTATC 
GGATAAGCCA 
TGGCATCCAG 
CCCCAAACAG 
TTCTGCACAT 
TATTTGTAAA 
TGCATGGTTT 
CGAACACGGA 
TCCTTCCCAG 
AAGAATACCA 
CACTCCCCCC 
CCTGGGGGCG 
GCGGTTGAAT 
GCTGGCAGGG 
GTTACTGCAA 
TCCTCTGCAA 
CTGCGGCAAG 
CGAGAAGCCC 
GCGCCACATG 
TCTCTCCACC 
CGCGCTCAAG 
GAACGGGGAC 
GGAGGAGGAG 
GGCGCGCCAC 
CGCCCTGCCC 
CTTCCACCAG 
CAGGGACACT 
TGTTAATGGC 
GCTGCTGGGC 
GTTCGACCTG 
CGGCTACGCG 
ATCGCCTTTT 
ACCGCCCGGG 
CACGCCCCAT 
CGACACTTGT
GAGAAGCCAC 
GAGTAGCAAG 
ATGTGAAATT 
ATGGCACAGT 
CCCTCCCTCA 
AGCCCCACTC 
AACAGAAGTA 
CTTTTTCTTT 
TGAATGCATG 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 
1500 
1560 
1620 
1680 
1740 
1800 
1860 
1920 
1980 
2040 
2100 
2160 
2220 
2280 
2340 
2400 
2460 
2520 
2580 
2640 
2700 
2760 
2820 
2880
2940 
3000 



ATCTGTATGG GGCAATACTA TTGCATTTTA CGCAAACTTT GAGCCTTTCT CTTGTGCAAT 3060

AATTTACATG TTGTGTATGT TTTTTTTTAA ACTTAGACAG CATGTATGGT ATGTTATGGC 3120

TATTTTAAAT TGTCCCTAAT TCGTTGCTGA GCAAACATGT TGCTGTTTCC AGTTCCGTTC 3180

TGAGAGAAAA AGAGAGAGAG AGAGAAAAAG ACCATGCTGC ATACATTCTG TAATACATAT 3240

CATGTACAGT TTTATTTTAT AACGTGAGGA GGAAAAACAG TCTTTGGATT AACCCTCTAT 3300

AGACAGAATA GATAGCACTG AAAAAAAATC TCTATGAGCT AAATGTCTGT CTCTAAAGGG 3360

TTAAATGTAT CAATTGGAAA GGAAGAAAAA AGGCCTTGAA TTGACAAATT AACAGAAAAA 3420 

CAGAACAAGT TTATTCTATC ATTTGGTTTT AAAATATGAG TGCCTTGGAT CTATTAAAAC 3480 

CACATCGATG GTTCTTTCTA CTTGTTATAA ACTTGTAGCT TAATTCAGCA TTGGGTGAGG 3540

TAATAAACCT TAGGAACTAG CATATAATTC TATATTGTAT TTCTCACAAC AATGGCTACC 3600

TAAAAAGATG ACCCATTATG TCCTAGTTAA TCATCATTTT TCCTTTAGTT TAATTTTATA 3660 

AACAAAACTG ATTATACCAG TATAAAAGCT ACTTTGCTCC TGGTGAGAGC TTAAAAGAAA 3720 

TGGGCTGTTT TGCCCAAAGT TTTATTTTTT TTAAACAATG ATTAAATTGA ATGTGTAATG 3780 

TGCAAAAGCC CTGGAACGCA ATTAAATACA CTAGTAAGGA GTTCATTTTA TGAAGATATT 3840 

TGCTTTAATA ATGTCTTTTT AAAAATACTG GCACCAAAAG AAATAGATCC AGATCTACTT 3900

GGTTGTCAAG TGGACAATCA AATGATAAAC TTTAAGACCT TGTATACCAT ATTGAAAGGA 3960

AGAGGCTGAC AATAAGGTTT GACAGAGGGG AACAGAAGAA AATAATATGA TTTATTAGCA 4020 

CAACGTGGTA CTATTTGCCA TTTAAAACTA GAACAGGTAT ATAAGCTAAT ATTGATACAA 4080 

TGATGATTAA CTATGAATTC TTAAGACTTG CATTTAAATG TGACATTCTT AAAAAAAGAA 4140 

GAGAAAGAAT TTTAAGAGTA GCAGTATATA TGTCTGTGCT CCCTAAAAGT TGTACTTCAT 4200

TTCTTTTCCA TACACTGTGT GCTATTTGTG TTAACATGGA AGAGGATTCA TTGTTTTTAT 4260 

TTTTATTTTT TTAATTTTTT CTTTTTTATT AAGCTAGCAT CTGCCCCAGT TGGTGTTCAA 4320 

ATAGCACTTG ACTCTGCCTG TGATATCTGT ATCTTTTCTC TAATCAGAGA TACAGAGGTT 4380 

GAGTATAAAA TAAACCTGCT CAGATAGGAC AATTAAGTGC ACTGTACAAT TTTCCCAGTT 4440 

TACAGGTCTA TACTTAAGGG AAAAGTTGCA AGAATGCTGA AAAAAAATTG AACACAATCT 4500

CATTGAGGAG CATTTTTTAA AAACTAAAAA AAAAAAAACT TTGCCAGCCA TTTACTTGAC 4560 

TATTGAGCTT ACTTACTTGG ACGCAACATT GCAAGCGCTG TGAATGGAAA CAGAATACAC 4620 

TTAACATAGA AATGAATGAT TGCTTTCGCT TCTACAGTGC AAGGATTTTT TTGTACAAAA 4680 

CTTTTTTAAA TATAAATGTT AAGAAAAATT TTTTTTAAAA AACACTTCAT TATGTTTAGG 4740 

GGGGAACTGC ATTTTAGGGT TCCATTGTCT TGGTGGTGTT ACAAGACTTG TTATCCATTT 4800

AAAAATGGTA GTGGAAATTC TATGCCTTGG ATACACACCG CTCTTCAGGT TGTAAAAAAA 4860 

AAAAACATAC ATTGGGGAAA GGTTTAAGAT TATATAGTAC TTAAATATAG GAAAATGCAC 4920 

ACTCATGTTG ATTCCTATGC TAAAATACAT TTATGGTCTT TTTTCTGTAT TTCTAGAATG 4980 

GTATTTGAAT TAAATGTTCA TCTAGTGTTA GGCACTATAG TATTTATATT GAAGCTTGTA 5040 

TTTTTAACTG TTGCTTGTTC TCTTAAAAGG TATCAATGTA CCTTTTTTGG TAGTGGAAAA 5100

AAAAAAGACA GGCTGCCACA GTATATTTTT TTAATTTGGC AGGATAATAT AGTGCAAATT 5160 

ATTTGTATGC TTCAAAAAAA AAAAAAAGAG AGAAACAAAA AAGTGTGACA TTACAGATGA 5220 

GAAGCCATAT AATGGCGGTT TGGGGGAGCC TGCTAGAATG TCACATGGAT GGCTGTCATA 5280

GGGGTTGTAC ATATCCTTTT TTGTTCCTTT TTCCTGCTGC CATACTGTAT GCAGTACTGC 5340

AAGCTAATAA CGTTGGTTTG TTATGTAGTG TGCTTTTTGT CCCTTTCCTT CTATCACCCT 5400

ACATTCCAGC ATCTTACCTT CATATGCAGT AAAAGAAAGA AAGAAAAAAA AAGGAAAAAA 5460 

AAAAAAAAAC CAATGTTTTG CAGTTTTTTT CATTGCCAAA AACTAAATGG TGCTTTATAT 5520 

TTAGATTGGA AAGAATTTCA TATGCAAAGC ATATTAAAGA GAAAGCCCGC TTTAGTCAAT 5580 

ACTTTTTTGT AAATGGCAAT GCAGAATATT TTGTTATTGG CCTTTTCTAT TCCTGTAATG 5640 

AAAGCTGTTT GTCGTAACTT GAAATTTTAT CTTTTACTAT GGGAGTCACT ATTTATTATT 5700

GCTTATGTGC CCTGTTCAAA ACAGAGGCAC TTAATTTGAT CTTTTATTTT TCTTTGTTTT 5760 

TATTTTTTTT TTTATTTAGA TGACCAAAGG TCATTACAAC CTGGCTTTTT ATTGTATTTG 5820 

TTTCTGGTCT TTGTTAAGTT CTATTGGAAA AACCACTGTC TGTGTTTTTT TGGCAGTTGT 5880 

CTGCATTAAC CTGTTCATAC ACCCATTTTG TCCCTTTATT GAAAAAATAA AAAAAATTAA 5940 

A

Seq ID NO: 85 Protein sequence: 
Protein Accession #: NP_075044.1 

1 11 21 31 41 51

| | | | | |

MSRRKQGKPQ HLSKREFSPE PLEAILTDDE PDHGPLGAPE GDHDLLTCGQ CQMNFPLGDI 60

LIFIEHKRKQ CNGSLCLEKA VDKPPSPSPI EMKKASNPVE VGIQVTPEDD DCLSTSSRRI 120 

CPKQEHIADK LLHWRGLSSP RSAHGALIPT PGMSAEYAPQ GICKDEPSSY TCTTCKQPFT 180 

SAWFLLQHAQ NTHGLRIYLE SEHGSPLTPR VGIPSGLGAE CPSQPPLHGI HIADNNPFNL 240

LRIPGSVSRE ASGLAEGRFP PTPPLFSPPP RHHLDPHRIE RLGAEEMALA THHPSAFDRV 300 

LRLNPMAMEP PAMDFSRRLR ELAGNTSSPP LSPGRPSPMQ RLLQPFQPGS KPPFLATPPL 360 

PPLQSAPPPS QPPVKSKSCE FCGKTFKFQS NLWHRRSHT GEKPYKCNLC DHACTQASKL 420

KRHMKTHMHK SSPMTVKSDD GLSTASSPEP GTSDLVGSAS SALKSWAKF KSENDPNLIP 480

ENGDEEEEED DEEEEEEEEE EEEELTESER VDYGFGLSLE AARHHENSSR GAWGVGDES 540

RALPDVMQGM VLSSMQHFSE AFHQVLGEKH KRGHLAEAEG HRDTCDEDSV AGESDRIDDG 600

TVNGRGCSPG ESASGGLSKK LLLGSPSSLS PFSKRIKLEK EFDLPPATMP NTENVYSQWL 660 

AGYAASRQLK DPFLSFGDSR QSPFASSSEH SSENGSLRFS TPPGELDGGI SGRSGTGSGG 720 

STPHISGPGT GRPSSKEGRR SDTCEYCGKV FKNCSNLTVH RRSHTGERPY KCELCNYACA 780 
QSSKLTRHMK THGQVGKDVY KCEICKMPFS VYSTLEKHMK KWHSDRVLNN DIKTE



Seq ID NO: 86 DNA sequence 
Nucleic Acid Accession #: XM_035292.2 
Coding sequence: 53-1576

1 11 21 31 41 51

| | | | | |

GCTCGCTGGG CCGCGGCTCC CGGGTGTCCC AGGCCCGGCC GGTGCGCAGA GCATGGCGGG 60

TGCGGGCCCG AAGCGGCGCG CGCTAGCGGC GCCGGCGGCC GAGGAGAAGG AAGAGGCGCG 120

GGAGAAGATG CTGGCCGCCA AGAGCGCGGA CGGCTCGGCG CCGGCAGGCG AGGGCGAGGG 180 

CGTGACCCTG CAGCGGAACA TCACGCTGCT CAACGGCGTG GCCATCATCG TGGGGACCAT 240 

TATCGGCTCG GGCATCTTCG TGACGCCCAC GGGCGTGCTC AAGGAGGCAG GCTCGCCGGG 300

GCTGGCGCTG GTGGTGTGGG CCGCGTGCGG CGTCTTCTCC ATCGTGGGCG CGCTCTGCTA 360

CGCGGAGCTC GGCACCACCA TCTCCAAATC GGGCGGCGAC TACGCCTACA TGCTGGAGGT 420

CTACGGCTCG CTGCCCGCCT TCCTCAAGCT CTGGATCGAG CTGCTCATCA TCCGGCCTTC 480

ATCGCAGTAC ATCGTGGCCC TGGTCTTCGC CACCTACCTG CTCAAGCCGC TCTTCCCCAC 540

CTGCCCGGTG CCCGAGGAGG CAGCCAAGCT CGTGGCCTGC CTCTGCGTGC TGCTGCTCAC 600

GGCCGTGAAC TGCTACAGCG TGAAGGCCGC CACCCGGGTC CAGGATGCCT TTGCCGCCGC 660 

CAAGCTCCTG GCCCTGGCCC TGATCATCCT GCTGGGCTTC GTCCAGATCG GGAAGGGTGA 720

TGTGTCCAAT CTAGATCCCA ACTTCTCATT TGAAGGCACC AAACTGGATG TGGGGAACAT 780 

TGTGCTGGCA TTATACAGCG GCCTCTTTGC CTATGGAGGA TGGAATTACT TGAATTTCGT 840 

CACAGAGGAA ATGATCAACC CCTACAGAAA CCTGCCCCTG GCCATCATCA TCTCCCTGCC 900

CATCGTGACG CTGGTGTACG TGCTGACCAA CCTGGCCTAC TTCACCACCC TGTCCACCGA 960

GCAGATGCTG TCGTCCGAGG CCGTGGCCGT GGACTTCGGG AACTATCACC TGGGCGTCAT 1020

GTCCTGGATC ATCCCCGTCT TCGTGGGCCT GTCCTGCTTC GGCTCCGTCA ATGGGTCCCT 1080

GTTCACATCC TCCAGGCTCT TCTTCGTGGG GTCCCGGGAA GGCCACCTGC CCTCCATCCT 1140

CTCCATGATC CACCCACAGC TCCTCACCCC CGTGCCGTCC CTCGTGTTCA CGTGTGTGAT 1200

GACGCTGCTC TACGCCTTCT CCAAGGACAT CTTCTCCGTC ATCAACTTCT TCAGCTTCTT 1260

CAACTGGCTC TGCGTGGCCC TGGCCATCAT CGGCATGATC TGGCTGCGCC ACAGAAAGCC 1320

TGAGCTTGAG CGGCCCATCA AGGTGAACCT GGCCCTGCCT GTGTTCTTCA TCCTGGCCTG 1380

CCTCTTCCTG ATCGCCGTCT CCTTCTGGAA GACACCCGTG GAGTGTGGCA TCGGCTTCAC 1440

CATCATCCTC AGCGGGCTGC CCGTCTACTT CTTCGGGGTC TGGTGGAAAA ACAAGCCCAA 1500

GTGGCTCCTC CAGGGCATCT TCTCCACGAC CGTCCTGTGT CAGAAGCTCA TGCAGGTGGT 1560
CCCCCAGGAG ACATAGCCAG GAGGCCGAGT GGCTGCCGGA GGAGCATGC 

Seq ID NO: 87 Protein sequence:
Protein Accession #: XP_035292.2

1 11 21 31 41 51

| | | | | |

MAGAGPKRRA LAAPAAEEKE EAREKMLAAK SADGSAPAGE GEGVTLQRNI TLLNGVAIIV 60
GTIIGSGIFV TPTGVLKEAG SPGLALVWA ACGVFSIVGA LCYAELGTTI SKSGGDYAYM 120
LEVYGSLPAF LKLWIELDII RPSSQYIVAL VFATYLLKPL FPTCPVPEEA AKLVACLCVL 180
LLTAVNCYSV KAATRVQDAF AAAKLLALAL IILLGFVQIG KGDVSNLDPN FSFEGTKLDV 240
GNIVLALYSG LFAYGGWNYL NFVTEEMINP YRNLPLAIII SLPIVTLVYV LTNLAYFTTL 300
STEQMLSSEA VAVDFGNYHL GVMSWIIPVF VGLSCFGSVN GSLFTSSRLF FVGSREGHLP 360
SILSMIHPQL LTPVPSLVFT CVMTLLYAFS KDIFSVINFF SFFNWLCVAL AIIGMIWLRH 420
RKPELERPIK VNLALPVFFI LACLFLIAVS FWKTPVECGI GFTIILSGLP VYFFGVWWKN 480
KPKWLLQGIF STTVLCQKLM QWPQET

Seq ID NO: 88 DNA sequence
Nucleic Acid Accession #: NM_005268.1
Coding sequence: 168-989

1 11 21 31 41 51

| | | | | |

TAAAAAGCAA AAGAATTCGC GGCCGCGTCG ACACGGGCTT CCCCGAAAAC CTTCCCCGCT 60
TCTGGATATG AAATTCAAGC TGCTTGCTGA GTCCTATTGC CGGCTGCTGG GAGCCAGGAG 120
AGCCCTGAGG AGTAGTCACT CAGTAGCAGC TGACGCGTGG GTCCACCATG AACTGGAGTA 180
TCTTTGAGGG ACTCCTGAGT GGGGTCAACA AGTACTCCAC AGCCTTTGGG CGCATCTGGC 240
TGTCTCTGGT CTTCATCTTC CGCGTGCTGG TGTACCTGGT GACGGCCGAG CGTGTGTGGA 300
GTGATGACCA CAAGGACTTC GACTGCAATA CTCGCCAGCC CGGCTGCTCC AACGTCTGCT 360
TTGATGAGTT CTTCCCTGTG TCCCATGTGC GCCTCTGGGC CCTGCAGCTT ATCCTGGTGA 420
CATGCCCCTC ACTGCTCGTG GTCATGCACG TGGCCTACCG GGAGGTTGAG GAGAAGAGGC 480
ACCGAGAAGC CCATGGGGAG AACAGTGGGC GCCTCTACCT GAACCCCGGC AAGAAGCGGG 540
GTGGGCTCTG GTGGACATAT GTCTGCAGCC TAGTGTTCAA GGCGAGCGTG GACATCGCCT 600
TTCTCTATGT GTTCCACTCA TTCTACCCCA AATATATCCT CCCTCCTGTG GTCAAGTGCC 660
ACGCAGATCC ATGTCCCAAT ATAGTGGACT GCTTCATCTC CAAGCCCTCA GAGAAGAACA 720
TTTTCACCCT CTTCATGGTG GCCACAGCTG CCATCTGCAT CCTGCTCAAC CTCGTGGAGC 780
TCATCTACCT GGTGAGCAAG AGATGCCACG AGTGCCTGGC AGCAAGGAAA GCTCAAGCCA 840
TGTGCACAGG TCATCACCCC CACGGTACCA CCTCTTCCTG CAAACAAGAC GACCTCCTTT 900
CGGGTGACCT CATCTTTCTG GGCTCAGACA GTCATCCTCC TCTCTTACCA GACCGCCCCC 960
GAGACCATGT GAAGAAAACC ATCTTGTGAG GGGCTGCCTG GACTGGTCTG GCAGGTTGGG 1020
CCTGGATGGG GAGGCTCTAG CATCTCTCAT AGGTGCAACC TGAGAGTGGG GGAGCTAAGC 1080
CATGAGGTAG GGGCAGGCAA GAGAGAGGAT TCAGACGCTC TGGGAGCCAG TTCCTAGTCC 1140
TCAACTCCAG CCACCTGCCC CAGCTCGACG GCACTGGGCC AGTTCCCCCT CTGCTCTGCA 1200
GCTCGGTTTC CTTTTCTAGA ATGGAAATAG TGAGGGCCAA TGC

Seq ID NO: 89 Protein sequence:
Protein Accession #: NP_005259.1

1 11 21 31 41 51

| | | | | |

MNWSIFEGLL SGVNKYSTAF GRIWLSLVFI FRVLVYLVTA ERVWSDDHKD FDCNTRQPGC 60
SNVCFDEFFP VSHVRLWALQ LILVTCPSLL WMHVAYREV QEKRHREAHG ENSGRLYLNP 120
GKKRGGLWWT YVCSLVFKAS VDIAFLYVFH SFYPKYILPP WKCHADPCP NIVDCFISKP 180
SEKNIFTLFM VATAAICILL NLVELIYLVS KRCHECLAAR KAQAMCTGHH PHGTTSSCKQ 240
DDLLSGDLIF LGSDSHPPLL PDRPRDHVKK TIL

Seq ID NO: 90 DNA sequence
Nucleic Acid Accession #: NM_002391.1
Coding sequence: 26-457

1 11 21 31 41 51

| | | | | |

CGGGCGAAGC AGCGCGGGCA GCGAGATGCA GCACCGAGGC TTCCTCCTCC TCACCCTCCT 60
CGCCCTGCTG GCGCTCACCT CCGCGGTCGC CAAAAAGAAA GATAAGGTGA AGAAGGGCGG 120
CCCGGGGAGC GAGTGCGCTG AGTGGGCCTG GGGGCCCTGC ACCCCCAGCA GCAAGGATTG 180
CGGCGTGGGT TTCCGCGAGG GCACCTGCGG GGCCCAGACC CAGCGCATCC GGTGCAGGGT 240
GCCCTGCAAC TGGAAGAAGG AGTTTGGAGC CGACTGCAAG TACAAGTTTG AGAACTGGGG 300
TGCGTGTGAT GGGGGCACAG GCACCAAAGT CCGCCAAGGC ACCCTGAAGA AGGCGCGCTA 360

CAATGCTCAG TGCCAGGAGA CCATCCGCGT CACCAAGCCC TGCACCCCCA AGACCAAAGC 
AAAGGCCAAA GCCAAGAAAG GGAAGGGAAA GGACTAGACG CCAAGCCTGG ATGCCAAGGA 
GCCCCTGGTG TCACATGGGG CCTGGCCACG CCCTCCCTCT CCCAGGCCCG AGATGTGACC 
CACCAGTGCC TTCTGTCTGC TCGTTAGCTT TAATCAATCA TGCCCTGCCT TGTCCCTCTC 
ACTCCCCAGC CCCACCCCTA AGTGCCCAAA GTGGGGAGGG ACAAGGGATT CTGGGAAGCT 
TGAGCCTCCC CCAAAGCAAT GTGAGTCCCA GAGCCCGCTT TTGTTCTTCC CCACAATTCC 
ATTACTAAGA AACACATCAA ATAAACTGAC TTTTTCCCCC CAATAAAAGC TCTTCTTTTT 
TAATAT 

Seq ID NO: 91 Protein sequence: 
Protein Accession #: NP_002382.1 

1 11 21 31 41 51 

| | | | | |

MQHRGFLLLT LLALLALTSA VAKKKDKVKK GGPGSECAEW AWGPCTPSSK DCGVGFREGT
CGAQTQRIRC RVPCNWKKEF GADCKYKFEN WGACDGGTGT KVRQGTLKKA RYNAQCQETI
RVTKPCTPKT KAKAKAKKGK GKD 



Seq ID NO: 92 DNA sequence 

Nucleic Acid Accession #: NM_005130.1 

Coding sequence: 98-802 

1 11 21 31 41 51 

| | | | | |

CTCTACCTGA CACAGCTGCA GCCTGCAATT CACTCCCACT GCCTGGGATT GCACTGGATC 
CGTGTGCTCA GAACAAGGTG AACGCCCAGC TGCAGCCATG AAGATCTGTA GCCTCACCCT 
GCTCTCCTTC CTCCTACTGG CTGCTCAGGT GCTCCTGGTG GAGGGGAAAA AAAAAGTGAA 
GAATGGACTT CACAGCAAAG TGGTCTCAGA ACAAAAGGAC ACTCTGGGCA ACACCCAGAT
TAAGCAGAAA AGCAGGCCCG GGAACAAAGG CAAGTTTGTC ACCAAAGACC AAGCCAACTG 
CAGATGGGCT GCTACTGAGC AGGAGGAGGG CATCTCTCTC AAGGTTGAGT GCACTCAATT 
GGACCATGAA TTTTCCTGTG TCTTTGCTGG CAATCCAACC TCATGCCTAA AGCTCAAGGA 
TGAGAGAGTC TATTGGAAAC AAGTTGCCCG GAATCTGCGC TCACAGAAAG ACATCTGTAG
ATATTCCAAG ACAGCTGTGA AAACCAGAGT GTGCAGAAAG GATTTTCCAG AATCCAGTCT 
TAAGCTAGTC AGCTCCACTC TATTTGGGAA CACAAAGCCC AGGAAGGAGA AAACAGAGAT 
GTCCCCCAGG GAGCACATCA AGGGCAAAGA GACCACCCCC TCTAGCCTAG CAGTGACCCA 
GACCATGGCC ACCAAAGCTC CCGAGTGTGT GGAGGACCCA GATATGGCAA ACCAGAGGAA 
GACTGCCCTG GAGTTCTGTG GAGAGACTTG GAGCTCTCTC TGCACATTCT TCCTCAGCAT 
AGTGCAGGAC ACGTCATGCT AATGAGGTCA AAAGAGAACG GGTTCCTTTA AGAGATGTCA 
TGTCGTAAGT CCCTCTGTAT ACTTTAAAGC TCTCTACAGT CCCCCCAAAA TATGAACTTT 
TGTGCTTAGT GAGTGCAACG AAATATTTAA ACAAGTTTTG TATTTTTTGC TTTTGTGTTT 
TGGAATTTGC CTTATTTTTC TTGGATGCGA TGTTCAGAGG CTGTTTCCTG CAGCATGTAT
TTCCATGGCC CACACAGCTA TGTGTTTGAG CAGCGAAGAG TCTTTGAGCT GAATGAGCCA 
GAGTGATAAT TTCAGTGCAA CGAACTTTCT GCTGAATTAA TGGTAATAAA ACTCTGGGTG 
TTTTTCAAAA AAAAAAAAAA AAA 

Seq ID NO: 93 Protein sequence: 
Protein Accession #: NP_005121.1 

1 11 21 31 41 51 

| | | | | |

MKICSLTLLS FLLLAAQVLL VEGKKKVKNG LHSKWSEQK DTLGNTQIKQ KSRPGNKGKF 
VTKDQANCRW AATEQEEGIS LKVECTQLDH EFSCVFAGNP TSCLKLKDER VYWKQVARNL 
RSQKDICRYS KTAVKTRVCR KDFPESSLKL VSSTLFGNTK PRKEKTEMSP REHIKGKETT 
PSSLAVTQTM ATKAPECVED PDMANQRKTA LEFCGETWSS LCTFFLSIVQ DTSC 

Seq ID NO: 94 DNA sequence 

Nucleic Acid Accession #: NM_012101

Coding sequence: 125-1891

1 11 21 31 41 51 

| | | | | |

CTCCTCACAG GTGTGTCTCT AGTCCTCGTG GTTGCCTGCC CCACTCCCTG CCGAGACGCC 
TGCCAGAAAG GTCACCTATC CTGAACCCCA GCAAGCCTGA AACAGCTCAG CCAAGCACCC 
TGCGATGGAA GCTGCAGATG CCTCCAGGAG CAACGGGTCG AGCCCAGAAG CCAGGGATGC
CCGGAGCCCG TCGGGCCCCA GTGGCAGCCT GGAGAATGGC ACCAAGGCTG ACGGCAAGGA 
TGCCAAGACC ACCAACGGGC ACGGCGGGGA GGCAGCTGAG GGCAAGAGCC TGGGCAGCGC
CCTGAAGCCA GGGGAAGGTA GGAGCGCCCT GTTCGCGGGC AATGAGTGGC GGCGACCCAT 
CATCCAGTTT GTCGAGTCCG GGGACGACAA GAACTCCAAC TACTTCAGCA TGGACTCTAT 
GGAAGGCAAG AGGTCGCCGT ACGCAGGGCT CCAGCTGGGG GCTGCCAAGA AGCCACCCGT 
TACCTTTGCC GAAAAGGGCG ACGTGCGCAA GTCCATTTTC TCGGAGTCCC GGAAGCCCAC 
GGTGTCCATC ATGGAGCCCG GGGAGACCCG GCGGAACAGC TACCCCCGGG CCGACACGGG 
CCTTTTTTCA CGGTCCAAGT CCGGCTCCGA GGAGGTGCTG TGCGACTCCT GCATCGGCAA 
CAAGCAGAAG GCGGTCAAGT CCTGCCTGGT GTGCCAGGCC TCCTTCTGCG AGCTGCATCT 
CAAGCCCCAC CTGGAGGGCG CCGCCTTCCG AGACCACCAG CTGCTCGAGC CCATCCGGGA 
CTTTGAGGCC CGCAAGTGTC CCGTGCATGG CAAGACGATG GAGCTCTTCT GCCAGACCGA 
CCAGACCTGC ATCTGCTACC TTTGCATGTT CCAGGAGCAC AAGAATCATA GCACCGTGAC 
AGTGGAGGAG GCCAAGGCCG AGAAGGAGAC GGAGCTGTCA CTGCAAAAGG AGCAGCTGCA 
GCTCAAGATC ATTGAGATTG AGGATGAAGC TGAGAAGTGG CAGAAGGAGA AGGACCGCAT 
CAAGAGCTTC ACCACCAATG AGAAGGCCAT CCTGGAGCAG AACTTCCGGG ACCTGGTGCG
GGACCTGGAG AAGCAAAAGG AGGAAGTGAG GGCTGCGCTG GAGCAGCGGG AGCAGGATGC 
TGTGGACCAA GTGAAGGTGA TCATGGATGC TCTGGATGAG AGAGCCAAGG TGCTGCATGA 
GGACAAGCAG ACCCGGGAGC AGCTGCATAG CATCAGCGAC TCTGTGTTGT TTCTGCAGGA 
ATTTGGTGCA TTGATGAGCA ATTACTCTCT CCCCCCACCC CTGCCCACCT ATCATGTCCT 
GCTGGAGGGG GAGGGCCTGG GACAGTCACT AGGCAACTTC AAGGACGACC TGCTCAATGT 
ATGCATGCGC CACGTTGAGA AGATGTGCAA GGCGGACCTG AGCCGTAACT TCATTGAGAG 
GAACCACATG GAGAACGGTG GTGACCATCG CTATGTGAAC AACTACACGA ACAGCTTCGG 




GGGTGAGTGG AGTGCACCGG ACACCATGAA GAGATACTCC ATGTACCTGA CACCCAAAGG 1560 

TGGGGTCCGG ACATCATACC AGCCCTCGTC TCCTGGCCGC TTCACCAAGG AGACCACCCA 1620 

GAAGAATTTC AACAATCTCT ATGGCACCAA AGGTAACTAC ACCTCCCGGG TCTGGGAGTA 1680

CTCCTCCAGC ATTCAGAACT CTGACAATGA CCTGCCCGTC GTCCAAGGCA GCTCCTCCTT 1740 

CTCCCTGAAA GGCTATCCCT CCCTCATGCG GAGCCAAAGC CCCAAGGCCC AGCCCCAGAC 1800 

TTGGAAATCT GGCAAGCAGA CTATGCTGTC TCACTACCGG CCATTCTACG TCAACAAAGG 1860 

CAACGGGATT GGGTCCAACG AAGCCCCATG AGCTCCTGGC GGAAGGAACG AGGCGCCACA 1920 

CCCCTGCTCT TCCTCCTGAC CCTGCTGCTC TTGCCTTCTA AGCTACTGTG CTTGTCTGGG 1980 

TGGGAGGGAG CCTGGTCCTG CACCTGCCCT CTGCAGCCCT CTGCCAGCCT CTTGGGGGCA 2040 

GTTCCGGCCT CTCCGACTTC CCCACTGGCC ACACTCCATT CAGACTCCTT TCCTGCCTTG 2100 

TGACCTCAGA TGGTCACCAT CATTCCTGTG CTCAGAGGCC AACCCATCAC AGGGGTGAGA 2160 

TAGGTTGGGG CCTGCCCTAA CCCGCCAGCC TCCTCCTCTC GGGCTGGATC TGGGGGCTAG 2220 

CAGTGAGTAC CCGCATGGTA TCAGCCTGCC TCTCCCGCCC ACGCCCTGCT GTCTCCAGGC 2280 

CTATAGACGT TTCTCTCCAA GGCCCTATCC CCCAATGTTG TCAGCAGATG CCTGGACAGC 2340

ACAGCCACCC ATCTCCCATT CACATGGCCC ACCTCCTGCT TCCCAGAGGA CTGGCCCTAC 2400 

GTGCTCTCTC TCGTCCTACC TATCAATGCC CAGCATGGCA GAACCTGCAG TGGCCAAGGG 2460 

CTGCAGATGG AAACCTCTCA GTGTCTTGAC ATCACCCTAC CCAGGCGGTG GGTCTCCACC 2520 

ACAGCCACTT TGAGTCTGTG GTCCCTGGAG GGTGGCTTCT CCTGACTGGC AGGATGACCT 2580 

TAGCCAAGAT ATTCCTCTGT TCCCTCTGCT GAGATAAAGA ATTCCCTTAA CATGATATAA 2640

TCCACCCATG CAAATAGCTA CTGGCCCAGC TACCATTTAC CATTTGCCTA CAGAATTTCA 2700

TTCAGTCTAC ACTTTGGCAT TCTCTCTGGC GATGGAGTGT GGCTGGGCTG ACCGCAAAAG 2760

GTGCCTTACA CACTGCCCCC ACCCTCAGCC GTTGCCCCAT CAGAGGCTGC CTCCTCCTTC 2820

TGATTACCCC CCATGTTGCA TATCAGGGTG CTCAAGGATT GGAGAGGAGA CAAAACCAGG 2880

AGCAGCACAG TGGGGACATC TCCCGTCTCA ACAGCCCCAG GCCTATGGGG GCTCTGGAAG 2940

GATGGGCCAG CTTGCAGGGG TTGGGGAGGG AGACATCCAG CTTGGGCTTT CCCCTTTGGA 3000
ATAAACCATT GGTCTGTC 

Seq ID NO: 95 Protein sequence: 
Protein Accession #: NP_036233.1 



1 11 21 31 41 51 

MEAADASRSN GSSPEARDAR SPSGPSGSLE NGTKADGKDA KTTNGHGGEA AEGKSLGSAL 60 

KPGEGRSALF AGNEWRRPII QFVESGDDKN SNYFSMDSME GKRSPYAGLQ LGAAKKPPVT 120

FAEKGDVRKS IFSESRKPTV SIMEPGETRR NSYPRADTGL FSRSKSGSEE VLCDSCIGNK 180 

QKAVKSCLVC QASFCELHLK PHLEGAAFRD HQLLEPIRDF EARKCPVHGK TMELFCQTDQ 240 

TCICYLCMFQ EHKNHSTVTV EEAKAEKETE LSLQKEQLQL KIIEIEDEAE KWQKEKDRIK 300 

SFTTNEKAIL EQNFRDLVRD LEKQKEEVRA ALEQREQDAV DQVKVIMDAL DERAKVLHED 360 

KQTREQLHSI SDSVLFLQEF GALMSNYSLP PPLPTYHVLL EGEGLGQSLG NFKDDLLNVC 420 

MRHVEKMCKA DLSRNFIERN HMENGGDHRY VNNYTNSFGG EWSAPDTMKR YSMYLTPKGG 480 

VRTSYQPSSP GRFTKETTQK NFNNLYGTKG NYTSRWEYS SSIQNSDNDL PWQGSSSFS 540 
LKGYPSLMRS QSPKAQPQTW KSGKQTMLSH YRPFYVNKGN GIGSNEAP 

Seq ID NO: 96 DNA sequence 

Nucleic Acid Accession #: NM_080668.1

Coding sequence: 83-841 

1 11 21 31 41 51 

GGCACGAGGG CAGCGAGTGG CCTTCCCGGT TGGCGCGCGC CCGGGGCGGC GGCGCTGGAG 60 

GAGCTCGAGA CGGAGCCTAG TTATGTCTGG GAGGCGAACG CGGTCCGGAG GAGCCGCTCA 120

GCGCTCCGGG CCAAGGGCCC CATCTCCTAC TAAGCCTCTG CGGAGGTCCC AGCGGAAATC 180 

AGGCTCTGAA CTCCCGAGCA TCCTCCCTGA AATCTGGCCG AAGACACCCA GTGCGGCTGC 240 

AGTCAGAAAG CCCATCGTCT TAAAGAGGAT CGTGGCCCAT GCTGTAGAGG TCCCAGCTGT 300 

CCAATCACCT CGCAGGAGCC CTAGGATTTC CTTTTTCTTG GAGAAAGAAA ACGAGCCCCC 360

TGGCAGGGAG CTTACTAAGG AGGACCTTTT CAAGACACAC AGCGTCCCTG CCACCCCCAC 420 

CAGCACTCCT GTGCCGAACC CTGAGGCCGA GTCCAGCTCC AAGGAAGGAG AGCTGGACGC 480 

CAGAGACTTG GAAATGTCTA AGAAAGTCAG GCGTTCCTAC AGCCGGCTGG AGACCCTGGG 540

CTCTGCCTCT ACCTCCACCC CAGGCCGCCG GTCCTGCTTT GGCTTCGAGG GGCTGCTGGG 600

GGCAGAAGAC TTGTCCGGAG TCTCGCCAGT GGTGTGCTCC AAACTCACCG AGGTCCCCAG 660

GGTTTGTGCA AAGCCCTGGG CCCCAGACAT GACTCTCCCT GGAATCTCCC CACCACCCGA 720

GAAACAGAAA CGTAAGAAGA AGAAAATGCC AGAGATCTTG AAAACGGAGC TGGATGAGTG 780 

GGCTGCGGCC ATGAATGCCG AGTTTGAAGC TGCTGAGCAG TTTGATCTCC TGGTTGAATG 840 

AGATGCAGTG GGGGGTGCAC CTGGCCAGAC TCTCCCTCCT GTCCTGTACA TAGCCACCTC 900 

CCTGTGGAGA GGACACTTAG GGTCCCCTCC CCTGGTCTTG TTACCTGTGT GTGTGCTGGT 960 

GCTGCGCATG AGGACTGTCT GCCTTTGAGG GCTTGGGCAG CAGCGGCAGC CATCTTGGTT 1020 

TTAGGAAATG GGGCCGCCTG GCCCAGCCAC TCACTGGTGT CCTGTCTCTT GTCGTCCTGT 1080 

CCTTCCTATC TCCCCAAAGT ACCATAGCCA GTTTCCAGAT GGGCCACAGA CTGGGGAGGA 1140 

GAATCAGTGG CCCAGCCAGA AGTTAAAGGG CTGAGGGTTG AGGTGAGAGG CACCTCTGCT 1200 

CTTGTTGGGA GGGGTGGCTG CTTGGAAATA GGCCCAGGGG CTCTGCCAGC CTCGGCCTCT 1260 

CCCTCCTGAG TTGCCTTCTG TTGGTGGCTT TCTTCTTGAA CCCACCTGTG TAAAGAGGTT 1320 

TTCAGTTCCG TGGGTTTCCC CTTTGATTCT GTAAATAGTC CCAGAGAGAA TTCGTGGGCT 1380

GAGGGCAATT CTGTCTTGGA GGAAGAAGCT GGACATTCAG CCTGTGGAGT CTGAGTTTTG 1440

AAGGATGTAG GGAGCCTTAG TTGGGTCTCA GACCATAAGT GTGTACTACA CAGAAGCTGT 1500 

GTTTTCTAGT TCTGGTCTGC TGTTGAGATG TTTGGTAAAT GCCAGGTTGA TAGGGCGCTG 1560 

GCTGCTTGGA GCAAAGGGTG CATTTCAGGG TGTGGCCACC AGGTGCTGTG AGTTTCTGTG 1620

GCTCATGGCC TCTGGGCTGG TCCCTTGCAC AGGGCCCACG CTGGAGTCTT ACCACTCTGC 1680 

TGCAGGGGTG GAAGGTGGCC CCTCTTGTCA CCCATACCCA TTTCTTACAA AATAAGTTAC 1740 

ACCGAGTCTA CTTGGCCCTA GAAGAGAAAG TTGAAGAGTC CCAGACCTAC TAGCATTTTG 1800 

CAACTATGCT TGTAAAGTCC TCGGAAAGTT TCCTCGCGTA CCAGACAGCG GCGGGGGCTG 1860

ATAGCAATTT TAGTTTTTGG CCTCCCTATC CTCTCACATG AGAACACTGC CTGGATGCAT 1920

CTCATGATCT CTGGAGAATT TCCCCATCTT TCTCTTCTTT CCATCGTGTG GATTCAATAG 1980

TTTGGATTTG AAGGCTGCCC TGCCCCCGAC TCTCCTGCCG CACCCCTGGC CATTGTACCT 2040

TTTGATGTTT AGAAGTTCGT GGAAGTAGAC GCTGAGGTGT GCAGAGGAGC TGGTGGATAA 2100

CAGAGAATGC CAGGGAAGAT GAGTGCTGGG TCAGGGTACT TGGATGAAAC GGTGCAGGCC 2160

AGGCGGGCCC TAATAAAACC CTCTGCCAGG TCTGGGAGTC CCAGGCCATC TGCTCAACGC 2220



TCTGTGGTTT GTCAGACCTG CAAGCAAGCC CCCTGCTGGG GAAGCCTAGG TGTCCTTGAG 2280
CTGAACCGCA CTGAAGAACT CTTGTCCTCA CTGGCTGATG CAGCAGAACT CTTGGGAAAT 2340
GTCTTAGTCC TGCAGAATCA GGAGTCACCA GATGATGCAG AGTTGAGATC ATCATTGCAA 2400
AGTTCTCTGT TCCTGAGGAA CTAAATTTAA GGAAAAAATG GGATTTTGTT TTAGAGTTGG 2460
AAAAAAAGCC TGATTAAAGA GTTTCTGCCT GTTAAAAAAA AAAAAAAAAA AAAAAA

Seq ID NO: 97 Protein sequence:
Protein Accession #: NP_542399.1

1 11 21 31 41 51

| | | | | |

MSGRRTRSGG AAQRSGPRAP SPTKPLRRSQ RKSGSELPSI LPEIWPKTPS AAAVRKPIVL 

KRIVAHAVEV PAVQSPRRSP RISPFLEKEN EPPGRELTKE DLFKTHSVPA TPTSTPVPNP 

EAESSSKEGE LDARDLEMSK KVRRSYSRLE TLGSASTSTP GRRSCFGFEG LLGAEDLSGV 

SPWCSKLTE VPRVCAKPWA PDMTLPGISP PPEKQICRKKK KMPEILKTEL DEWAAAMNAE 
FEAAEQFDLL VE 



Seq ID NO: 98 DNA sequence 
Nucleic Acid Accession #: Eos sequence
Coding sequence: 58-12444

1          11         21         31         41         51
|          |          |          |          |          |
GGGGCATTTC CGGGTCCGGG CCGAGCGGGC GCACGCGCGG GAGCGGGACT CGGCGGCATG 60
GCGGGCTCCG GAGCCGGTGT GCGTTGCTCC CTGCTGCGGC TGCAGGAGAC CTTGTCCGCT 120
GCGGACCGCT GCGGTGCTGC CCTGGCCGGT CATCAACTGA TCCGCGGCCT GGGGCAGGAA 180
TGCGTCCTGA GCAGCAGCCC CGCGGTGCTG GCATTACAGA CATCTTTAGT TTTTTCCAGA 240
GATTTCGGTT TGCTTGTATT TGTCCGGAAG TCACTCAACA GTATTGAATT TCGTGAATGT 300
AGAGAAGAAA TCCTAAAGTT TTTATGTATT TTCTTAGAAA AAATGGGCCA GAAGATCGCA 360
CCTTACTCTG TTGAAATTAA GAACACTTGT ACCAGTGTTT ATACAAAAGA TAGAGCTGCT 420
AAATGTAAAA TTCCAGCCCT GGACCTTCTT ATTAAGTTAC TTCAGACTTT TAGAAGTTCT 480
AGACTCATGG ATGAATTTAA AATTGGAGAA TTATTTAGTA AATTCTATGG AGAACTTGCA 540
TTGAAAAAAA AAATACCAGA TACAGTTTTA GAAAAAGTAT ATGAGCTCCT AGGATTATTG 600
GGTGAAGTTC ATCCTAGTGA GATGATAAAT AATGCAGAAA ACCTGTTCCG CGCTTTTCTG 660
GGTGAACTTA AGACCCAGAT GACATCAGCA GTAAGAGAGC CCAAACTACC TGTTCTGGCA 720
GGATGTCTGA AGGGGTTGTC CTCACTTCTG TGCAACTTCA CTAAGTCCAT GGAAGAAGAT 780
CCCCAGACTT CAAGGGAGAT TTTTAATTTT GTACTAAAGG CAATTCGTCC TCAGATTGAT 840
CTGAAGAGAT ATGCTGTGCC CTCAGCTGGC TTGCGCCTAT TTGCCCTGCA TGCATCTCAG 900
TTTAGCACCT GCCTTCTGGA CAACTACGTG TCTCTATTTG AAGTCTTGTT AAAGTGGTGT 960
GCCCACACAA ATGTAGAATT GAAAAAAGCT GCACTTTCAG CCCTGGAATC CTTTCTGAAA 1020
CAGGTTTCTA ATATGGTGGC GAAAAATGCA GAAATGCATA AAAATAAACT GCAGTACTTT 1080
ATGGAGCAGT TTTATGGAAT CATCAGAAAT GTGGATTCGA ACAACAAGGA GTTATCTATT 1140
GCTATCCGTG GATATGGACT TTTTGCAGGA CCGTGCAAGG TTATAAACGC AAAAGATGTT 1200
GACTTCATGT ACGTTGAGCT CATTCAGCGC TGCAAGCAGA TGTTCCTCAC CCAGACAGAC 1260
ACTGGTGACG ACCGTGTTTA TCAGATGCCA AGCTTCCTCC AGTCTGTTGC AAGCGTCTTG 1320
CTGTACCTTG ACACAGTTCC TGAGGTGTAT ACTCCAGTTC TGGAGCACCT CGTGGTGATG 1380
CAGATAGACA GTTTCCCACA GTACAGTCCA AAAATGCAGC TGGTGTGTTG CAGAGCCATA 1440
GTGAAGGTGT TCCTAGCTTT GGCAGCAAAA GGGCCAGTTC TCAGGAATTG CATTAGTACT 1500
GTGGTGCATC AGGGTTTAAT CAGAATATGT TCTAAACCAG TGGTCCTTCC AAAGGGCCCT 1560
GAGTCTGAAT CTGAAGACCA CCGTGCTTCA GGGGAAGTCA GAACTGGCAA ATGGAAGGTG 1620
CCCACATACA AAGACTACGT GGATCTCTTC AGACATCTCC TGAGCTCTGA CCAGATGATG 1680
GATTCTATTT TAGCAGATGA AGCATTTTTC TCTGTGAATT CCTCCAGTGA AAGTCTGAAT 1740
CATTTACTTT ATGATGAATT TGTAAAATCC GTTTTGAAGA TTGTTGAGAA ATTGGATCTT 1800
ACACTTGAAA TACAGACTGT TGGGGAACAA GAGAATGGAG ATGAGGCGCC TGGTGTTTGG 1860
ATGATCCCAA CTTCAGATCC AGCGGCTAAC TTGCATCCAG CTAAACCTAA AGATTTTTCG 1920
GCTTTCATTA ACCTGGTGGA ATTTTGCAGA GAGATTCTCC CTGAGAAACA AGCAGAATTT 1980
TTTGAACCAT GGGTGTACTC ATTTTCATAT GAATTAATTT TGCAATCTAC AAGGTTGCCC 2040
CTCATCAGTG GTTTCTACAA ATTGCTTTCT ATTACAGTAA GAAATGCCAA GAAAATAAAA 2100
TATTTCGAGG GAGTTAGTCC AAAGAGTCTG AAACACTCTC CTGAAGACCC AGAAAAGTAT 2160
TCTTGCTTTG CTTTATTTGT GAAATTTGGC AAAGAGGTGG CAGTTAAAAT GAAGCAGTAC 2220
AAAGATGAAC TTTTGGCCTC TTGTTTGACC TTTCTTCTGT CCTTGCCACA CAACATCATT 2280
GAACTCGATG TTAGAGCCTA CGTTCCTGCA CTGCAGATGG CTTTCAAACT GGGCCTGAGC 2340
TATACCCCCT TGGCAGAAGT AGGCCTGAAT GCTCTAGAAG AATGGTCAAT TTATATTGAC 2400
AGACATGTAA TGCAGCCTTA TTACAAAGAC ATTCTCCCCT GCCTGGATGG ATACCTGAAG 2460
ACTTCAGCCT TGTCAGATGA GACCAAGAAT AACTGGGAAG TGTCAGCTCT TTCTCGGGCT 2520
GCCCAGAAAG GATTTAATAA AGTGGTGTTA AAGCATCTGA AGAAGACAAA GAACCTTTCA 2580
TCAAACGAAG CAATATCCTT AGAAGAAATA AGAATTAGAG TAGTACAAAT GCTTGGATCT 2640
CTAGGAGGAC AAATAAACAA AAATCTTCTG ACAGTCACGT CCTCAGATGA GATGATGAAG 2700
AGCTATGTGG CCTGGGACAG AGAGAAGCGG CTGAGCTTTG CAGTGCCCTT TAGAGAGATG 2760
AAACCTGTCA TTTTCCTGGA TGTGTTCCTG CCTCGAGTCA CAGAATTAGC GCTCACAGCC 2820
AGTGACAGAC AAACTAAAGT TGCAGCCTGT GAACTTTTAC ATAGCATGGT TATGTTTATG 2880
TTGGGCAAAG CCACGCAGAT GCCAGAAGGG GGACAGGGAG CCCCACCCAT GTACCAGCTC 2940
TATAAGCGGA CGTTTCCTGT GCTGCTTCGA CTTGCGTGTG ATGTTGATCA GGTGACAAGG 3000
CAACTGTATG AGCCACTAGT TATGCAGCTG ATTCACTGGT TCACTAACAA CAAGAAATTT 3060
GAAAGTCAGG ATACTGTTGC CTTACTAGAA GCTATATTGG ATGGAATTGT GGACCCTGTT 3120
GACAGTACTT TAAGAGATTT TTGTGGTCGG TGTATTCGAG AATTCCTTAA ATGGTCCATT 3180
AAGCAAATAA CACCACAGCA GCAGGAGAAG AGTCCAGTAA ACACCAAATC GCTTTTCAAG 3240
CGACTTTATA GCCTTGCGCT TCACCCCAAT GCTTTCAAGA GGCTGGGAGC ATCACTTGCC 3300
TTTAATAATA TCTACAGGGA ATTCAGGGAA GAAGAGTCTC TGGTGGAACA GTTTGTGTTT 3360
GAAGCCTTGG TGATATACAT GGAGAGTCTG GCCTTAGCAC ATGCAGATGA GAAGTCCTTA 3420
GGTACAATTC AACAGTGTTG TGATGCCATT GATCACCTAT GCCGCATCAT TGAAAAGAAG 3480
CATGTTTCTT TAAATAAAGC AAAGAAACGA CGTTTGCCGC GAGGATTTCC ACCTTCCGCA 3540
TCATTGTGTT TATTGGATCT GGTCAAGTGG CTTTTAGCTC ATTGTGGGAG GCCCCAGACA 3600
GAATGTCGAC ACAAATCCAT TGAACTCTTT TATAAATTCG TTCCTTTATT GCCAGGCAAC 3660
AGATCCCCTA ATTTGTGGCT GAAAGATGTT CTCAAGGAAG AAGGTGTCTC TTTTCTCATC 3720
AACACCTTTG AGGGGGGTGG CTGTGGCCAG CCCTCGGGCA TCCTGGCCCA GCCCACCCTC 3780
TTGTACCTTC GGGGGCCATT CAGCCTGCAG GCCACGCTAT GCTGGCTGGA CCTGCTCCTG 3840